Abstract. The theory of regular cost functions is a quantitative extension to the classical
notion of regularity. A cost function associates to each input a non-negative integer value 
(or infinity), as opposed to languages which only associate to each input the two values
"inside" and "outside". This theory is a continuation of the works on distance automata
and similar models. These models of automata have been successfully used for solving the 
star-height problem, the finite power property, the finite substitution problem, the relative 
inclusion star-height problem and the boundedness problem for monadic second-order logic
over words. Our notion of regularity can be - as in the classical theory of regular languages 
- equivalently defined in terms of automata, expressions, algebraic recognisability, and by 
a variant of the monadic second-order logic. These equivalences are strict extensions of 
the corresponding classical results. 

The present paper introduces the cost monadic logic, the quantitative extension to the 
notion of monadic second-order logic we use, and shows that some problems of existence
of bounds are decidable for this logic. This is achieved by introducing the corresponding 
algebraic formalism: stabilisation monoids. 



1. Introduction 

This paper introduces and studies a quantitative extension to the standard theory of regular 
languages of words. It is the only quantitative extension (in which quantitative means that 
the function described can take infinitely many values) known to the author in which the 
milestone equivalence for regular languages: 

accepted by automata = recognisable by monoids 
= definable in monadic second-order logic = definable by regular expressions 
can be faithfully extended. 

This theory is developed in several papers. The objective of the present one is the 
introduction of the logical formalism, and its resolution using algebraic tools. However, in 
this introduction, we try to give a broader panorama. 
1.1. Related works. The theory of regular cost functions involves the use of automata
(called B- and S-automata), algebraic structures (called stabilisation monoids), a logic
(called cost monadic logic), and suitable regular expressions (called B- and S-regular
expressions). All these models happen to be of the same expressiveness. Though most of these
concepts are new, some are very close to objects known from the literature. As such, the 
present work is the continuation of several branches of research. 

The general idea behind these works is that we want to represent functions, i.e., quan- 
titative variants of languages, and that, ideally we want to keep strong decision results. 
Works related to cost functions go in this direction, where the quantitative notion is the 
ability to count, and the decidability results are concerned with the existence/non-existence 
of bounds. 

A prominent question in this theory is the star-height problem. This story begins in 
1963 when Eggan formulates the star-height decision problem [T3j : 

Input: A regular language of words L and a non-negative integer k.
Output: Yes, if there exists a regular expression using at most k nestings of Kleene
stars which defines L. No, otherwise. 
Eggan proved that the hierarchy induced by k does not collapse, but the decision problem 
itself was quickly considered as central in language theory, and as the most difficult problem 
in the area. 

Though some partial results were obtained by McNaughton, Dejean and Schützenberger
[36] [12], it took twenty-five years before Hashiguchi came up with a proof of decidability
spread over four papers [181 HZl HI 1201. This proof is notoriously difficult, and no clean
exposition of it has ever been presented. 

Hashiguchi used in his proof the model of distance automata. A distance automaton is a 
finite state non-deterministic automaton running over words which can count the number of 
occurrences of some "special" states. Such an automaton associates to each word a natural 
number, which is the least number of occurrences of special states among all the accepting 
runs (or nothing if there is no accepting run over this input). The proof of Hashiguchi relies 
on a very difficult reduction to the following limitedness problem: 
Input: A distance automaton. 

Output: Yes, if the automaton is limited, i.e., if the function it computes is bounded 
over its domain. No, otherwise. 
Hashiguchi established the decidability of this problem [TTj. The notion of distance au- 
tomata and its relationship with the tropical semiring (distance automata can be seen as 
automata over the tropical semiring, i.e., the semiring $(\mathbb{N} \cup \{\infty\}, \min, +)$) has been the
source of many investigations [l8l[2ll[23l[Hl[35lll0lllIl[l3lll7lll8]. 
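
To make the definition of a distance automaton concrete, here is a minimal Python sketch that evaluates such an automaton on a word by dynamic programming in the (min, +) semiring. The encoding (a dictionary `transitions`, the sets `initial`, `final` and `special`) and the name `distance_value` are assumptions made for this illustration only. So are the conventions that every visit of a special state is counted, the initial one included, and that the absence of an accepting run is reported as infinity.

```python
import math

def distance_value(word, transitions, initial, final, special):
    """Least number of visits of special states among all accepting runs.

    transitions maps (state, letter) to a set of successor states.
    Returns math.inf when no accepting run exists (assumed convention).
    """
    # cost[q] = least number of special states seen on a run ending in q
    cost = {q: (1 if q in special else 0) for q in initial}
    for letter in word:
        nxt = {}
        for q, c in cost.items():
            for r in transitions.get((q, letter), ()):
                c2 = c + (1 if r in special else 0)
                if c2 < nxt.get(r, math.inf):
                    nxt[r] = c2  # keep the minimum, as in the tropical semiring
        cost = nxt
    return min((c for q, c in cost.items() if q in final), default=math.inf)

# Hypothetical automaton over {a, b}: state "q" is special and is entered
# exactly when the letter a is read, so the value is the number of a's.
T = {("p", "a"): {"q"}, ("q", "a"): {"q"},
     ("p", "b"): {"p"}, ("q", "b"): {"p"}}
print(distance_value("abaab", T, {"p"}, {"p", "q"}, {"q"}))  # 3
```

This automaton is not limited: its value is unbounded over the words a, aa, aaa, and so on.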

Despite this research, the star-height problem itself remained not so well understood for 
seventeen more years. In 2005, Kirsten gave a much simpler and self-contained proof [26].
The principle is to use a reduction to the limitedness problem for a form of automata more 
general than distance automata, called nested distance desert automata. To understand this 
extension, let us first look again at distance automata: we can see a distance automaton 
as an automaton that has a counter which is incremented each time a "special" state is 
encountered. The value attached to a word by such an automaton is the minimum over

Footnote: Regular expressions are built on top of letters using the language operations of concatenation, union, and
Kleene star. This problem is sometimes referred to as the restricted star-height problem, while the version
also allowing complement is the generalised star-height problem, and has a very different status.
all accepting runs of the maximal value assumed by the counter. Presented like this, a
nested distance desert automaton is nothing but a distance automaton in which multiple 
counters and reset of the counters are allowed (with a certain constraint of nesting of 
counters). Kirsten performed a reduction of the star-height problem to the limitedness of
nested distance desert automata which is much easier than the reduction of Hashiguchi. He 
also proves that the limitedness problem of nested distance desert automata is decidable. 
For this, he generalises the proof methods developed previously by Hashiguchi, Simon and 
Leung for distance automata. This work closes the story of the star-height problem itself. 

The star-height problem is the king among the problems solved using this method. 
But there are many other (difficult) questions that can be reduced to the limitedness of 
distance automata and variants. Some of the solutions to these problems paved the way to 
the solution of the star-height problem. 

The finite power property takes as input a regular language L and asks whether there 
exists some positive integer n such that $(L + \varepsilon)^n = L^*$. It was raised by Brzozowski in
1966, and it took twelve years before being independently solved by Simon and Hashiguchi 
|4m I16| . This problem is easily reduced to the limitedness problem for distance automata. 

The finite substitution problem takes as input two regular languages L, K, and asks 
whether it is possible to find a finite substitution $\sigma$ (i.e., a morphism mapping each letter
of the alphabet of L to a finite language over the alphabet of K) such that $\sigma(L) = K$.
This problem was shown decidable independently by Bala and Kirsten by a reduction to
the limitedness of desert automata (a form of automata weaker than nested distance desert
automata, but incomparable to distance automata), and a proof of decidability of this latter
problem [2] [25].

The relative inclusion star-height problem is an extension of the star height problem 
introduced and shown decidable by Hashiguchi using his techniques [22]. Still using nested
distance desert automata, Kirsten gave another, more elegant proof of this result [29].

The boundedness problem is a problem of model theory. It consists of deciding if there 
exists a bound on the number of iterations that are necessary for the fixpoint of a logical 
formula to be reached. The existence of a bound means that the fixpoint can be eliminated 
by unfolding its definition sufficiently many times. The boundedness problem is usually 
parameterised by the logic chosen and by the class of models over which the formula is 
studied. The boundedness problem for monadic second-order formulae over the class of 
finite words was solved by a reduction to the limitedness problem of distance automata by 
Blumensath, Otto and Weyer [3]. 

One can also cite applications of distance automata in speech recognition [37] [38],
databases p^, and image compression [24]. In the context of verification, Abdulla, Krcal
and Yi have introduced R-automata, which correspond to nested distance desert automata 
in which the nesting of counters is not required anymore [1] . They prove the decidability of 
the limitedness problem for this model of automata. 

Finally, Löding and the author have also pursued this line of research in the
direction of extended models. In [9], the star-height problem over trees has been solved, by 
a reduction to the limitedness problem of nested distance desert automata over trees. The 
latter problem was shown decidable in the more general case of alternating automata. In [IH] 
a similar attempt has been tried for deciding the Mostowski hierarchy of non-deterministic 
automata over infinite trees (the hierarchy induced by the alternation of fixpoints). The 
authors show that it is possible to reduce this problem to the limitedness problem for a 
form of automata that unifies nested distance desert automata and parity tree automata.
The latter problem is an important open question. 

Bojańczyk and the author have introduced the notion of B-automata in [5], a model
which closely resembles (and predates) R-automata. The context was to show the decidability
of some fragments of the logic MSO+U over infinite words, in which MSO+U is the
extension of monadic second-order logic with the quantifier $\mathbb{U}X.\varphi$ meaning
"for all integers n, there exists a set X of cardinality at least n such that $\varphi$ holds". From the
decidability results in this work, it is possible to derive all the other limitedness results over
finite words. However, the constructions are complicated and of non-elementary complexity. 
Nevertheless, the new notion of S-automata was introduced, a model dual to B-automata. 
Recall that the semantics of distance automata and their variants can be expressed as a 
minimum over all runs of the maximum of the value taken by counters. The semantics 
of S-automata is dual: it is defined as the maximum over all runs of the minimum of the 
value taken by the counters at the moment of their reset. Unfortunately, it is quite hard 
to compare in detail this work with all others. Indeed, since it was oriented toward the 
study of a logic over infinite words, the central automata are in fact $\omega$B- and $\omega$S-automata:
automata accepting languages of infinite words that have an infinitary accepting condition
constraining the asymptotic behaviour of the counters along the run. This makes these
automata very different. Indeed, the automata in [5] accept languages while the automata
under study here define functions. For achieving this, the automata use an extra mechanism
involving the asymptotic behaviours of counters for deciding whether an infinite word should
be accepted or not. This extra mechanism has no equivalent in distance automata, and is 
in some sense "orthogonal" to the machinery involved in cost functions. For this reason, 
B-automata and S-automata in [5] are just intermediate objects that do not have all the 
properties we would like. In particular B- and S-automata in [5] are not equivalent. However,
the principle of using two dual forms of automata is an important concept in the theory 
of regular cost functions. The study of MSO+U has been pursued in several directions. 
Indeed, the general problem of the satisfaction of MSO+U is a challenging open problem. 
One partial result concerns the decision of WMSO+U (the weak fragment in which only 
quantifiers over finite sets are allowed), which is decidable. However, the techniques
involved in this work are not directly related to cost functions. 

The proof methods for showing the decidability of the limitedness problem of distance 
automata and their variants are also of much interest in themselves. While the original
proof of Hashiguchi is quite complex, a major advance has been achieved by Leung who 
introduced the notion of stabilisation [32] [33] (see also ^T] for an early overview). The
principle is to abstract the behaviour of the distance automaton in a monoid, and further 
describe the semantics of the counter using an operator of stabilisation, i.e., an operator 
which describes, given an element of the monoid, what would be the effect of iterating it 
a "lot of times". This key idea was further used and refined by Simon, Leung, Kirsten, 
Abdulla, Krcal and Yi. This idea was not present in [5], and this is one explanation for the 
bad complexity of the constructions. 

Another theory related to cost functions is the one developed by Szymon Toruńczyk
in his thesis [46]. The author proposes a notion of recognisable languages of profinite
words which happen to be equivalent to cost functions. Indeed, profinite words are infinite 

Footnote: One must be careful: these automata are not related to B- and S-automata as, say, Büchi automata are
related to automata over finite words. We warn the reader that these models cannot be thought of as the
extension of cost functions to infinite words.
sequences of finite words (which are convergent in a precise topology, the profinite topology).
As such, a single profinite word can be used as a witness that a function is not bounded. 
Following the principle of this correspondence, one can see a cost function as a set of 
profinite words: the profinite words corresponding to infinite sequences of words over which 
the function is bounded. This correspondence makes Toruńczyk's approach equi-expressive
with cost functions over finite words as far as decision questions are concerned. Seen like 
this, this approach can be seen as the theory of cost functions presented in a more abstract 
setting. Still, some differences have to be underlined. On one side, the profinite approach, 
being more abstract, loses some precision. For instance in the present work, we have a good 
understanding of the precision of the constructions: namely each operation can be performed 
doing an at most "polynomial approximation". On the other side, the presentation in terms
of profinite languages eliminates the corresponding annoying details in the development of 
cost functions: namely there is no more need to control the approximation at each step. 
Another interesting point is that the profinite presentation points naturally to extensions, 
which are orthogonal to cost functions, and are highly related to MSO+U. For the moment, 
the profinite approach has been developed for finite words only. It is not clear for now how 
easily this abstract presentation can be used for treating more complex models, as has
been done for cost functions, e.g., over finite trees [11]. 

1.2. Survey of the theory. The theory of regular cost functions gives a unified and general 
framework for explaining all objects, results and constructions presented above (apart from 
the results in ^ that are of a slightly different nature). It also allows new results to be derived.

Let us describe the contributions in more detail.
Cost functions. The standard notion of language is replaced by the new notion of cost 
function. For this, we consider mappings from a set $E$ to $\mathbb{N} \cup \{\infty\}$ (in practice $E$ is the set
of finite words over some finite alphabet) and the equivalence relation $\approx$ defined by $f \approx g$
if:

for all $X \subseteq E$, $f$ restricted to $X$ is bounded iff $g$ restricted to $X$ is bounded.
Hence two functions are equivalent if it is not possible to distinguish them using arguments
of existence of bounds. A cost function is an equivalence class for $\approx$. The notion of cost
functions is what we use as a quantitative extension to languages. Indeed, every language L
can be identified with (the equivalence class of) the function mapping words in L to the
value 0, and words outside L to $\infty$. All the theory is presented in terms of cost functions.
This means that all equivalences are considered modulo the relation $\approx$.

Cost automata. A first way to define regular cost functions is to use cost automata, which 
come in two flavours, B- and S-automata. The B-automata correspond in their simple 
form to R-automata [1] and in their simple and hierarchical form to nested distance desert 
automata in [27] [28]. Those are also very close to B-automata in [5]. Following the ideas
in [5], we also use the dual variant of S-automata. The two forms of automata, B-automata 
and S-automata, are equi-expressive in all their variants, an equivalence that we call the 
duality theorem. Automata are not introduced in this paper. 

Footnote: The notion of approximation may be misleading: the results are exact, but, since we are only interested
in boundedness questions, we allow ourselves to perform some harmless distortions of the functions. This 
distortion is measured by an approximation parameter called the correction function. 
Stabilisation monoids. The corresponding algebraic characterisation makes use of the new
notion of stabilisation monoids. A stabilisation monoid is a finite ordered monoid together
with a stabilisation operation. This stabilisation operation expresses what it means to
iterate "a lot of times" some element. The operator of stabilisation was introduced by Leung
[32] [33] and used also by Simon, Kirsten, Abdulla, Krcal and Yi as a tool for analysing the
behaviour of distance automata and their variants. The novelty here lies in the fact that in 
our case, stabilisation is now part of the definition of a stabilisation monoid. We prove that 
it is possible to associate unique semantics to all stabilisation monoids. These semantics 
are represented by means of computations. A computation is an object describing how a 
word consisting of elements of the stabilisation monoid can be evaluated into a value in the 
stabilisation monoid. This key result shows that the notion of stabilisation monoid has a 
"meaning" independent from the existence of cost automata (in the same way a monoid 
can be used for recognising a language, independently from the fact that it comes from a 
finite state automaton). This notion of computations is easier to handle than the notion of 
compatible mappings used in the conference version of this work |B].

Recognisable cost functions. We use stabilisation monoids for defining the new notion of 
recognisable cost functions. We show the closure of recognisable cost functions under min, 
max, and new operations called inf-projection and sup-projection (which are counterparts 
to projection in the theory of regular languages). We also prove that the relation $\approx$ (in fact
the corresponding preorder $\preccurlyeq$) is decidable over recognisable cost functions. This decidability
result subsumes many limitedness results from the literature. This notion of recognisability 
for cost functions is equivalent to being accepted by the cost automata introduced above. 

Extension of regular expressions. It is possible to define two forms of expressions, B- and S- 
regular expressions, and show that these are equivalent to cost automata. These expressions 
were already introduced in [5] in which a similar result was established.

Cost monadic logic. The cost monadic (second-order) logic is a quantitative extension to 
monadic (second-order) logic. It is for instance possible to define the diameter of a graph 
in cost monadic logic. The cost functions over words definable in this logic coincide with 
the regular cost functions presented above. This equivalence is essentially the consequence 
of the closure properties of regular cost functions (as in the case of regular languages), and 
no new ideas are required here. The interest lies in the logic itself. Of course, the decision 
procedure for recognisable cost functions entails decidability results for cost monadic logic.
In this paper, cost monadic logic is the starting point of our presentation, and our central 
decidability result is Theorem 2.1 stating the decidability of this logic. 



1.3. Content of this paper. This paper does not cover the whole theory of regular cost 
functions over words. The line followed in this paper is to start from the logic "cost monadic 
logic", and to introduce the necessary material for "solving it over words". This requires 
the complete development of the algebraic formalism. 

In Section 2, we introduce the new formalism of cost monadic logic, and show what is
required to solve it. In particular, we introduce the notion of cost function, and advocate
that it is useful to consider the logic under this view. We state there our main decision
result, Theorem 2.1.

In Section 3, we present the underlying algebraic structure: stabilisation monoids. We
then introduce computations, and establish the key results of existence (Theorem 3.3) and
uniqueness (Theorem 3.4) of the value computed by computations.

In Section 4, we use stabilisation monoids for defining recognisable cost functions. We
show various closure results for recognisable cost functions as well as decision procedures.
Those results happen to fulfill the conditions required in Section 2 for showing the decidability
of cost monadic logic over words.

In Section 5, some arguments are given on the relationship with the models of automata,
which are not described in this document, and on how these different notions interact in 
the big picture. 



2. Logic 

2.1. Cost monadic logic. Let us recall that monadic second-order logic (monadic logic 
for short) is the extension of first-order logic with the ability to quantify over sets (i.e., 
monadic relations). Formally monadic formulae use first-order variables ($x, y, \dots$), and
monadic variables ($X, Y, \dots$), and it is allowed in such formulae to quantify existentially
and universally over both first-order and monadic variables, to use every boolean connective,
to use the membership predicate ($x \in X$), and every predicate of the relational structure.
We expect from the reader basic knowledge concerning monadic logic. 

Example 2.0.1. The monadic formula $\mathrm{reach}(x, y, X)$ over the signature containing the
single binary predicate edge (signature of a digraph):

$$\mathrm{reach}(x, y, X) ::= x = y \;\vee\; \forall Z\, \Big( \big( x \in Z \wedge \forall z, z'\, (z \in Z \wedge z' \in X \wedge \mathrm{edge}(z, z') \rightarrow z' \in Z) \big) \rightarrow y \in Z \Big)$$

describes the existence of a path in a digraph from vertex x to vertex y such that all edges
appearing in the path end in X. Indeed, it expresses that either the path is empty, or every
set Z containing x and closed under taking edges ending in X also contains y.

In cost monadic logic, one uses a single extra variable $N$ of a new kind, called the bound
variable. It ranges over non-negative integers. Cost monadic logic is obtained from monadic
logic by allowing the extra predicate $|X| \le N$, in which $X$ is some monadic variable and
$N$ the bound variable, if and only if it appears positively in the formula (i.e., under the
scope of an even number of negations). The semantics of $|X| \le N$ is, as one may expect, to
be satisfied if (the valuation of) $X$ has cardinality at most (the valuation of) $N$. Given a
formula $\varphi$, we denote by $FV(\varphi)$ its free variables, the bound variable excluded. A formula
that has no free variables (it may still use the bound variable) is called a sentence.

We now have to provide a meaning to the formulae of cost monadic logic. We assume 
some familiarity of the reader with logic terminology. A signature consists of a set of 
symbols R, S, . . . . To each symbol is attached a non-negative integer called its arity. A 
(relational) structure (over the above signature) $\mathcal{S} = \langle U_{\mathcal{S}}, R_1^{\mathcal{S}}, \dots, R_k^{\mathcal{S}} \rangle$ consists of a set $U_{\mathcal{S}}$
called the universe, and, for each symbol $R$ of arity $n$, of a relation $R^{\mathcal{S}} \subseteq U_{\mathcal{S}}^n$. Given
a set of variables $F$, a valuation of $F$ (over $\mathcal{S}$) is a mapping $v$ which to each monadic
variable $X \in F$ associates a set $v(X) \subseteq U_{\mathcal{S}}$, and to each first-order variable $x \in F$ associates
an element $v(x) \in U_{\mathcal{S}}$. We denote by $v, X = E$ the valuation $v$ in which $X$ is further mapped
to $E$. Given a cost monadic formula $\varphi$, a valuation $v$ of its free variables over a structure $\mathcal{S}$
and a non-negative integer $n$, we express by $\mathcal{S}, v, n \models \varphi$ the fact that the formula $\varphi$ is
satisfied over the structure $\mathcal{S}$ with valuation $v$ when the variable $N$ takes the value $n$. Of
course, if $\varphi$ is simply a sentence, we just write $\mathcal{S}, n \models \varphi$. We also omit the parameter $n$
when $\varphi$ is a monadic formula.

The positivity assumption required when using the predicate $|X| \le N$ has straightforward
consequences. Namely, for all cost monadic sentences $\varphi$, all relational structures $\mathcal{S}$,
and all valuations $v$, $\mathcal{S}, v, n \models \varphi$ implies $\mathcal{S}, v, m \models \varphi$ for all $m \ge n$.

Instead of evaluating as true or false as done above, we see a formula of cost monadic 
logic $\varphi$ of free variables $F$ as associating to each relational structure $\mathcal{S}$ and each valuation $v$
of the free variables a value in $\mathbb{N} \cup \{\infty\}$ defined by:

$$[\![\varphi]\!](\mathcal{S}, v) = \inf\{n : \mathcal{S}, v, n \models \varphi\}.$$

This value can be either a non-negative integer, or $\infty$ if no valuation of $N$ makes the sentence
true. In case of a sentence $\varphi$, we omit the valuation and simply write $[\![\varphi]\!](\mathcal{S})$. Let us stress
the link with standard monadic logic in the following fact: 

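Fact 2.0.2. For all monadic formulae $\varphi$, and all relational structures $\mathcal{S}$,

$$[\![\varphi]\!](\mathcal{S}) = \begin{cases} 0 & \text{if } \mathcal{S} \models \varphi \\ \infty & \text{otherwise.} \end{cases}$$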



Example 2.0.3. The sentence $\forall X\, |X| \le N$ calculates the size of a structure. More formally,
$[\![\forall X\, |X| \le N]\!](\mathcal{S})$ equals $|U_{\mathcal{S}}|$.



A more interesting example makes use of Example 2.0.1. Again over the signature of
digraphs, the cost monadic sentence:

$$\mathrm{diameter} ::= \forall x, y\; \exists X\; \big( |X| \le N \wedge \mathrm{reach}(x, y, X) \big)$$

defines the diameter of the digraph: indeed, the diameter of a graph is the least $n$ such
that for all pairs of vertices $x, y$, there exists a set of size at most $n$ through which $y$ can be reached from $x$
(recall that in the definition of $\mathrm{reach}(x, y, X)$, $x$ does not necessarily belong to $X$, hence
this is the diameter in the standard sense).
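
The reading of the diameter sentence can be checked by brute force on small digraphs. The following Python sketch is only an illustration: the names `reach` and `diameter`, and the encoding of a digraph as a set of vertices with a set of edge pairs, are assumptions of this example. It enumerates candidate sets X by increasing size, exactly as the sentence is read, and returns infinity when some vertex cannot be reached from another.

```python
import math
from itertools import combinations

def reach(edges, x, y, X):
    """True iff there is a path from x to y whose edges all end in X
    (the empty path is allowed, so reach(x, x, X) always holds)."""
    seen, stack = {x}, [x]
    while stack:
        z = stack.pop()
        if z == y:
            return True
        for (s, t) in edges:
            if s == z and t in X and t not in seen:
                seen.add(t)
                stack.append(t)
    return False

def diameter(vertices, edges):
    """Least n such that every pair (x, y) admits a set X of size at most n
    with reach(x, y, X); math.inf when no such n exists."""
    vertices = list(vertices)
    worst = 0
    for x in vertices:
        for y in vertices:
            need = math.inf
            for n in range(len(vertices) + 1):
                if any(reach(edges, x, y, set(X))
                       for X in combinations(vertices, n)):
                    need = n  # smallest witness size for this pair
                    break
            worst = max(worst, need)
    return worst

cycle = {(0, 1), (1, 2), (2, 0)}
print(diameter({0, 1, 2}, cycle))  # 2
```

The search is exponential in the number of vertices and is meant only to make the semantics tangible, not to be an efficient procedure.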

From now on, for avoiding some irrelevant considerations, we will consider the variant of 
cost monadic logic in which a) only monadic variables are allowed, b) the inclusion relation 
$X \subseteq Y$ is allowed, and c) each relation over elements is raised to a relation over singleton
sets. Keeping in mind that each element can be identified with the unique singleton set 
containing it, it is easy to translate cost monadic logic into this variant. In this presentation, 
it is also natural to see the inclusion relation as any other relation. We will also assume 
that the negations are pushed to the leaves of formulae as is usual. Overall a formula can 
be of one of the following forms: 

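$$R(X_1, \dots, X_n) \;\mid\; \neg R(X_1, \dots, X_n) \;\mid\; |X| \le N \;\mid\; \varphi \wedge \psi \;\mid\; \varphi \vee \psi \;\mid\; \exists X.\varphi \;\mid\; \forall X.\varphi$$

in which $\varphi$ and $\psi$ are formulas, $R$ is some symbol of arity $n$ which can possibly be $\subseteq$ (of
arity 2), and $X, X_1, \dots, X_n$ are monadic variables.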

So far, we have described the semantics of cost monadic logic from the standard notion of
model. There is another equivalent way to describe the meaning of formulae, by induction 
on the structure. The equations are disclosed in the following fact. 
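Fact 2.0.4. Over a structure $\mathcal{S}$ and a valuation $v$, the following equalities hold:

$$\begin{aligned}
[\![R(X_1, \dots, X_n)]\!](\mathcal{S}, v) &= \begin{cases} 0 & \text{if } R^{\mathcal{S}}(v(X_1), \dots, v(X_n)) \\ \infty & \text{otherwise} \end{cases} \\
[\![\neg R(X_1, \dots, X_n)]\!](\mathcal{S}, v) &= \begin{cases} \infty & \text{if } R^{\mathcal{S}}(v(X_1), \dots, v(X_n)) \\ 0 & \text{otherwise} \end{cases} \\
[\![|X| \le N]\!](\mathcal{S}, v) &= |v(X)| \\
[\![\varphi \vee \psi]\!](\mathcal{S}, v) &= \min\big([\![\varphi]\!](\mathcal{S}, v), [\![\psi]\!](\mathcal{S}, v)\big) \\
[\![\varphi \wedge \psi]\!](\mathcal{S}, v) &= \max\big([\![\varphi]\!](\mathcal{S}, v), [\![\psi]\!](\mathcal{S}, v)\big) \\
[\![\exists X\, \varphi]\!](\mathcal{S}, v) &= \inf\{[\![\varphi]\!](\mathcal{S}, v, X = E) : E \subseteq U_{\mathcal{S}}\} \\
[\![\forall X\, \varphi]\!](\mathcal{S}, v) &= \sup\{[\![\varphi]\!](\mathcal{S}, v, X = E) : E \subseteq U_{\mathcal{S}}\}
\end{aligned}$$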
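
Over a finite structure, these equalities can be run directly as a recursive evaluator. The sketch below is a hedged illustration in Python: the tuple encoding of formulas (`"rel"`, `"nrel"`, `"card"`, `"or"`, `"and"`, `"exists"`, `"forall"`), the function name `value`, and the representation of relations as Python predicates over frozensets are all assumptions of this example. Since the universe is finite, inf and sup range over finitely many sets and become `min` and `max`.

```python
import math
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def value(phi, universe, relations, v=None):
    """Value of the formula phi under the valuation v, by induction on phi."""
    v = v or {}
    op = phi[0]
    if op == "rel":    # R(X1, ..., Xn): 0 if it holds, infinity otherwise
        holds = relations[phi[1]](*(v[x] for x in phi[2]))
        return 0 if holds else math.inf
    if op == "nrel":   # negated atom: infinity if R holds, 0 otherwise
        holds = relations[phi[1]](*(v[x] for x in phi[2]))
        return math.inf if holds else 0
    if op == "card":   # |X| <= N evaluates to the cardinality of v(X)
        return len(v[phi[1]])
    if op == "or":
        return min(value(phi[1], universe, relations, v),
                   value(phi[2], universe, relations, v))
    if op == "and":
        return max(value(phi[1], universe, relations, v),
                   value(phi[2], universe, relations, v))
    if op in ("exists", "forall"):
        _, x, body = phi
        vals = [value(body, universe, relations, {**v, x: e})
                for e in subsets(universe)]
        return min(vals) if op == "exists" else max(vals)
    raise ValueError(f"unknown connective: {op!r}")

# The sentence of Example 2.0.3 computes the size of the structure.
size = ("forall", "X", ("card", "X"))
print(value(size, range(3), {}))  # 3
```

The evaluator enumerates all subsets at each quantifier, so it is usable only on very small universes.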

As it is the case for monadic logic, no property (if not trivial) is decidable for monadic 
logic in general. Since cost monadic logic is an extension of monadic logic, one cannot 
expect anything to be better in this framework. However we are interested, as in the 
standard setting, to decide properties over a restricted class C of structures. The class C 
can typically be the class of finite words, of finite trees, of infinite words (of length $\omega$, or
beyond) or of infinite trees. The subject of this paper is to consider the case of finite words 
over a fixed finite alphabet. 

We are interested in deciding properties concerning the function described by cost 
monadic formulae over C. But what kind of properties? It is quite easy to see that, given 
a cost monadic sentence $\varphi$ and $n \in \mathbb{N}$, one can effectively produce a monadic formula $\varphi^n$
such that for all structures $\mathcal{S}$, $\mathcal{S} \models \varphi^n$ iff $[\![\varphi]\!](\mathcal{S}) = n$ (such a translation would be possible
even without assuming the positivity requirement in the use of the predicates $|X| \le N$).
Hence, deciding questions of the form "$[\![\varphi]\!] = n$" can be reduced to the standard theory.

Properties that cannot be reduced to the standard theory, and that we are interested 
in, involve the existence of bounds. One says below that a function / is bounded over some 
set $X$ if there is some integer $n$ such that $f(x) \le n$ for all $x \in X$. We are interested in the
following generic problems: 

Boundedness: Is the function $[\![\varphi]\!]$ bounded over $C$?
Or (variant), is $[\![\varphi]\!]$ bounded over a regular subset of $C$?
Or (limitedness), is $[\![\varphi]\!]$ bounded over $\{\mathcal{S} : [\![\varphi]\!](\mathcal{S}) \neq \infty\}$?
Divergence: For all $n$, do only finitely many $\mathcal{S} \in C$ satisfy $[\![\varphi]\!](\mathcal{S}) \le n$?
Said differently, are all sets over which $[\![\varphi]\!]$ is bounded of finite cardinality?
Domination: For all $E \subseteq C$, does $[\![\varphi]\!]$ bounded over $E$ imply that $[\![\psi]\!]$ is also bounded
over $E$?

All these questions cannot be reduced (at least simply) to questions in the standard theory. 
Furthermore, all these questions become undecidable for very standard reasons as soon as 
the requirement of positivity in the use of the new predicate \X\ < N is removed. In this 
paper, we introduce suitable material for proving their decidability over the class C of words. 

One easily sees that the domination question is in fact a joint extension of the boundedness
question (if one sets $\varphi$ to be always true, i.e., to compute the constant function 0),
and the divergence question (if one sets $\psi$ to be measuring the size of the structure, i.e.,
$\forall X\, |X| \le N$). Let us remark finally that if $\varphi$ is a formula of monadic logic, then the
boundedness question corresponds to deciding if $\varphi$ is a tautology. If furthermore $\psi$ is also
monadic, then the domination consists of deciding whether $\varphi$ implies $\psi$.

In the following section, we introduce the notion of cost functions, i.e., equivalence 
classes over functions allowing to omit discrepancies of the function described, while pre- 
serving sufficient information for working with the above questions. 

2.2. Cost functions. In this section, we introduce the equivalence relation $\approx$ over functions,
and the central notion of cost function.

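A correction function $\alpha$ is a non-decreasing mapping from $\mathbb{N}$ to $\mathbb{N}$ such that $\alpha(n) \ge n$ for
all $n$. From now on, the symbols $\alpha, \alpha', \dots$ implicitly designate correction functions. Given
$x, y$ in $\mathbb{N} \cup \{\infty\}$, $x \preccurlyeq_\alpha y$ holds if $x \le \alpha(y)$ in which $\alpha$ is the extension of $\alpha$ with $\alpha(\infty) = \infty$.
For every set $E$, $\preccurlyeq_\alpha$ is extended to $(\mathbb{N} \cup \{\infty\})^E$ in a natural way by $f \preccurlyeq_\alpha g$ if $f(x) \preccurlyeq_\alpha g(x)$
for all $x \in E$, or equivalently $f \le \alpha \circ g$. Intuitively, $f$ is dominated by $g$ after it has been
"stretched" by $\alpha$. One also writes $f \approx_\alpha g$ if $f \preccurlyeq_\alpha g$ and $g \preccurlyeq_\alpha f$. Finally, one writes $f \preccurlyeq g$
(resp. $f \approx g$) if $f \preccurlyeq_\alpha g$ (resp. $f \approx_\alpha g$) for some $\alpha$. A cost function (over a set $E$) is an
equivalence class of $\approx$ (i.e., a set of mappings from $E$ to $\mathbb{N} \cup \{\infty\}$).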

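Some elementary properties of $\preccurlyeq_\alpha$ are:

Fact 2.0.5. If $\alpha \le \alpha'$ and $f \preccurlyeq_\alpha g$, then $f \preccurlyeq_{\alpha'} g$. If $f \preccurlyeq_\alpha g \preccurlyeq_{\alpha'} h$, then $f \preccurlyeq_{\alpha \circ \alpha'} h$.

The above fact allows us to work with a single correction function at a time. Indeed, as
soon as two correction functions $\alpha$ and $\alpha'$ are involved in the same proof, we can consider
the correction function $\alpha'' = \max(\alpha, \alpha')$. By the above fact, $f \preccurlyeq_\alpha g$
implies $f \preccurlyeq_{\alpha''} g$, and $f \preccurlyeq_{\alpha'} g$ implies $f \preccurlyeq_{\alpha''} g$.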

Example 2.0.6. Over $\mathbb{N} \times \mathbb{N}$, maximum and sum are equivalent for the doubling correction
function (for short, $(\max) \approx_{\times 2} (+)$). Indeed, for all $x, y \in \mathbb{N}$,

$\max(x, y) \le x + y \le 2 \times \max(x, y)$.
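The relations $\preccurlyeq_\alpha$ and $\approx_\alpha$ can be tested mechanically on a finite sample of the domain. The following is a minimal sketch, not part of the original development: the helper names `extend`, `dominated` and `equivalent` are hypothetical, and a check on a finite sample can only refute a domination, never prove it over an infinite set.

```python
import math

INF = math.inf  # plays the role of the value "infinity" in N ∪ {∞}

def extend(alpha):
    """Extend a correction function alpha: N -> N with alpha(∞) = ∞."""
    return lambda n: INF if n == INF else alpha(n)

def dominated(f, g, alpha, domain):
    """Check f ≼_alpha g on a finite sample: f(x) <= alpha(g(x)) for every x."""
    a = extend(alpha)
    return all(f(x) <= a(g(x)) for x in domain)

def equivalent(f, g, alpha, domain):
    """Check f ≈_alpha g on the sample (domination in both directions)."""
    return dominated(f, g, alpha, domain) and dominated(g, f, alpha, domain)

double = lambda n: 2 * n    # the doubling correction function
identity = lambda n: n      # the smallest correction function

pairs = [(x, y) for x in range(50) for y in range(50)]
maximum = lambda p: max(p)
total = lambda p: p[0] + p[1]

print(equivalent(maximum, total, double, pairs))    # True
print(dominated(total, maximum, identity, pairs))   # False: 1 + 1 > max(1, 1)
```

The second call shows why a correction is needed at all: without stretching, the sum is not dominated by the maximum.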

Our next examples concern mappings from sequences of words to $\mathbb{N}$. We have

$|\cdot|_a \preccurlyeq |\cdot|$ and $|\cdot|_a \not\preccurlyeq |\cdot|_b$

where $a$ and $b$ are distinct letters, $|\cdot|$ is the function mapping each word to its length and
$|\cdot|_a$ the function mapping each word to the number of occurrences of the letter $a$ it contains.
Indeed we have $|\cdot|_a \le |\cdot|$, but the set of words $a^*$ is a witness that $|\cdot|_a \preccurlyeq_\alpha |\cdot|_b$ cannot hold
whatever $\alpha$ is.

Given words $u_1, \ldots, u_k \in \{a, b\}^*$, we have

$|u_1 \cdots u_k|_a \approx_\alpha \max(|K|, \max_{i=1 \ldots k} |u_i|_a)$ where $K = \{i \in \{1, \ldots, k\} : |u_i|_a \ge 1\}$

in which $\alpha$ is the squaring function. Indeed, for one direction we just have to remark:

$\max(|K|, \max_{i=1 \ldots k} |u_i|_a) \le |u_1 \cdots u_k|_a$,

and for the other direction we use:

$|u_1 \cdots u_k|_a \le \sum_{i \in K} |u_i|_a \le (\max(|K|, \max_{i=1 \ldots k} |u_i|_a))^2$.
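The two inequalities behind this equivalence can be spot-checked on random inputs. The sketch below is only an illustration under stated assumptions: the helper names `count_a` and `bound` are hypothetical, and random testing does not replace the short argument given above.

```python
import random

def count_a(word):
    """|u|_a: the number of occurrences of the letter a in the word."""
    return word.count("a")

def bound(words):
    """max(|K|, max_i |u_i|_a), where K indexes the words containing an a."""
    counts = [count_a(u) for u in words]
    size_of_k = sum(1 for c in counts if c >= 1)
    return max([size_of_k] + counts)

random.seed(0)
for _ in range(1000):
    k = random.randint(1, 8)
    words = ["".join(random.choice("ab") for _ in range(random.randint(0, 6)))
             for _ in range(k)]
    total = count_a("".join(words))
    # Both directions of the equivalence, with the squaring correction function.
    assert bound(words) <= total <= bound(words) ** 2

print(bound(["ab", "b", "aaa"]), count_a("abbaaa"))  # 3 4
```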

The relation $\preccurlyeq$ has other characterisations:
Proposition 1. For all $f, g$ from $E$ to $\mathbb{N} \cup \{\infty\}$, the following items are equivalent:
(1) $f \preccurlyeq g$,
(2) $\forall n \in \mathbb{N}\ \exists m \in \mathbb{N}\ \forall x \in E\quad g(x) \le n \rightarrow f(x) \le m$, and;

(3) for all $X \subseteq E$, $g|_X$ is bounded implies $f|_X$ is bounded.

Proof. From (1) to (2). Let us assume $f \preccurlyeq g$, i.e., $f \preccurlyeq_\alpha g$ for some $\alpha$. Let $n$ be some
non-negative integer, and $m = \alpha(n)$. We have for all $x$ that $g(x) \le n$ implies $f(x) \le
(\alpha \circ g)(x) \le \alpha(n) = m$, thus establishing the second statement.

From (2) to (3). Let $X \subseteq E$ be such that $g|_X$ is bounded. Let $n$ be a bound of $g$
over $X$. Item (2) states the existence of $m$ such that $\forall x \in E\ g(x) \le n \rightarrow f(x) \le m$. In
particular, for all $x \in X$, we have $g(x) \le n$ by choice of $n$, and hence $f(x) \le m$. Hence $f|_X$
is bounded by $m$.

From (3) to (1). Let $n \in \mathbb{N}$, and consider the set $X_n = \{x : g(x) \le n\}$. The mapping
$g$ is bounded over $X_n$ (by $n$), and hence by (3), $f$ is also bounded over $X_n$. We set $\alpha(n) =
\max(n, \sup f(X_n))$. Since $X_n \subseteq X_{n+1}$, the function $\alpha$ is non-decreasing. Since furthermore
$\alpha(n) \ge n$, $\alpha$ is a correction function. Let now $x \in E$. If $g(x) < \infty$, we have
that $x \in X_{g(x)}$ by definition of the $X_n$'s. Hence $f(x) \le \sup f(X_{g(x)}) \le \alpha(g(x))$. Otherwise
$g(x) = \infty$, and we have $f(x) \le \alpha(g(x)) = \infty$. Hence $f \preccurlyeq_\alpha g$. □
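The last step of the proof is constructive: it builds the correction function $\alpha(n) = \max(n, \sup f(X_n))$. The sketch below replays this construction on a finite sample, where every supremum is trivially finite; the helper name `witness_correction` and the choice of product and maximum as test functions are assumptions made for illustration only.

```python
def witness_correction(f, g, domain):
    """Build alpha(n) = max(n, sup f(X_n)) with X_n = {x in domain : g(x) <= n}.

    On a finite sample every sup is finite, so alpha is always defined here.
    Over an infinite set this requires f to be bounded on each X_n.
    """
    def alpha(n):
        values = [f(x) for x in domain if g(x) <= n]
        return max([n] + values)
    return alpha

domain = [(x, y) for x in range(30) for y in range(30)]
f = lambda p: p[0] * p[1]    # product of the two components
g = lambda p: max(p)         # maximum of the two components

alpha = witness_correction(f, g, domain)

# alpha is a correction function on the tested range, and f ≼_alpha g holds.
assert all(alpha(n) <= alpha(n + 1) for n in range(40))
assert all(alpha(n) >= n for n in range(40))
assert all(f(x) <= alpha(g(x)) for x in domain)

print([alpha(n) for n in range(5)])  # [0, 1, 4, 9, 16]
```

On this sample the construction recovers the squaring function, which is indeed the smallest correction witnessing that the product is dominated by the maximum.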

The last characterisation shows that the relation $\approx$ is an equivalence relation that
preserves the existence of bounds. Indeed, all this theory can be seen as a method for
proving the existence/non-existence of bounds. One can also remark that the questions of
boundedness, divergence, and domination presented in the previous section, are preserved
under replacing the semantics of a formula by an $\approx$-equivalent function. Furthermore, the
domination question can be simply reformulated as $[\![\varphi]\!] \succcurlyeq [\![\psi]\!]$.

We conclude this section by some remarks on the structure of the $\preccurlyeq$ relation. Cost
functions over some set $E$ ordered by $\preccurlyeq$ form a lattice. Let us show how this lattice refines
the lattice of subsets of $E$ ordered by inclusion. The following elementary fact shows that
we can identify a subset of $E$ with the cost function of its characteristic function (given a
subset $X \subseteq E$, one denotes by $\chi_X$ its characteristic mapping defined by $\chi_X(x) = 0$ if $x \in X$,
and $\infty$ otherwise):

Fact 2.0.7. For all $X, Y \subseteq E$, $\chi_X \preccurlyeq \chi_Y$ iff $X \supseteq Y$.
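This fact can be verified exhaustively on a small set. The sketch below rests on one observation that is an assumption of the illustration rather than a statement from the text: over a finite set $E$, $f \preccurlyeq g$ holds exactly when $f$ is finite wherever $g$ is finite, since $\alpha(n) = \max(n, c)$ with $c$ the largest finite value of $f$ is then a suitable correction function. The helper names are hypothetical.

```python
import math
from itertools import combinations

def chi(X):
    """Characteristic mapping of X: 0 inside X, infinity outside."""
    return lambda x: 0 if x in X else math.inf

def dominated_on_finite_set(f, g, E):
    """On a finite set E, f ≼ g iff f is finite wherever g is finite."""
    return all(f(x) < math.inf for x in E if g(x) < math.inf)

E = range(4)
subsets = [set(c) for r in range(len(E) + 1) for c in combinations(E, r)]

# chi_X ≼ chi_Y holds exactly when X is a superset of Y.
assert all(dominated_on_finite_set(chi(X), chi(Y), E) == (X >= Y)
           for X in subsets for Y in subsets)
print(len(subsets))  # 16
```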

In this respect, the lattice of cost functions is a refinement of the lattice of subsets of $E$
equipped with the superset ordering. Let us show that this refinement is strict. Indeed,
there is only one language $L$ such that $\chi_L$ does not have $\infty$ in its range, namely $L = E$.
However, we will show in Proposition 2 that, as soon as $E$ is infinite, there are uncountably
many cost functions which have this property of not using the value $\infty$.

Proposition 2. If $E$ is infinite, then there exist at least continuum many different cost
functions from $E$ to $\mathbb{N}$.

Proof. Without loss of generality, we can assume $E$ countable, and even, up to bijection,
that $E = \mathbb{N} \setminus \{0\}$. Let $p_0, p_1, \ldots$ be the sequence of all prime numbers. Every $n \in E$ is
decomposed in a unique way as $p_0^{n_0} p_1^{n_1} \cdots$ in which all $n_i$'s are null but finitely many (with
an obvious meaning of the infinite product). For all $I \subseteq \mathbb{N}$, one defines the function $f_I$
from $\mathbb{N} \setminus \{0\}$ to $\mathbb{N}$ for all $n \in \mathbb{N} \setminus \{0\}$ by:

$f_I(n) = \max\{n_i : i \in I,\ n = p_0^{n_0} p_1^{n_1} \cdots\}$.

Consider now two different sets $I, J \subseteq \mathbb{N}$. This means (up to a possible exchange of the
roles of $I$ and $J$) that there exists $i \in I \setminus J$. Consider now the set $X = \{p_i^k : k \in \mathbb{N}\}$. Then,
by construction, $f_I(p_i^k) = k$ and hence $f_I$ is not bounded over $X$. However, $f_J(p_i^k) = 0$ and
hence $f_J$ is bounded over $X$. It follows by Proposition 1 that $f_I$ and $f_J$ are not equivalent
for $\approx$. Since there exist continuum many subsets of $\mathbb{N}$, we can finally conclude that there
are at least continuum many cost functions over $E$ which do not use the value $\infty$. □
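The functions $f_I$ of this proof are easy to compute, which makes the separating set $X$ concrete. The following sketch is an illustration only: the helper names `primes`, `exponents` and `f` are hypothetical, primes are indexed from 0 as in the sequence $p_0, p_1, \ldots$, and the maximum over an empty set of exponents is taken to be 0 (an assumption, since the text does not spell out this case).

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (enough for small examples)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def exponents(n):
    """Return {i: n_i} for the non-zero exponents of n = p_0^{n_0} p_1^{n_1} ..."""
    result = {}
    for i, p in enumerate(primes()):
        if n == 1:
            return result
        while n % p == 0:
            result[i] = result.get(i, 0) + 1
            n //= p

def f(index_set, n):
    """f_I(n): the largest exponent n_i with i in I (0 if there is none)."""
    return max((e for i, e in exponents(n).items() if i in index_set), default=0)

I, J = {0, 1}, {1}                # i = 0 belongs to I but not to J, and p_0 = 2
X = [2 ** k for k in range(10)]   # a finite prefix of the set {p_0^k : k in N}

print([f(I, x) for x in X])  # exponents 0 to 9: f_I is unbounded along X
print([f(J, x) for x in X])  # all zeros: f_J is bounded over X
```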



2.3. Solving cost monadic logic over words using cost functions. As usual, we see
a word as a structure, the universe of which is the set of positions in the word (numbered
from 1), equipped with the ordering relation $<$, and with a unary relation for each letter
of the alphabet that we interpret as the set of positions at which the letter occurs. Given
a set of monadic variables $F$, and a valuation $v$ of $F$ over a word $u = a_1 \ldots a_k \in A^*$, we
denote by $\langle u, v \rangle$ the word $c_1 \ldots c_k$ over the alphabet $A_F = A \times \{0, 1\}^F$ such that for all
positions $i = 1 \ldots k$, $c_i = (a_i, \delta_i)$ in which $\delta_i$ maps $X \in F$ to $1$ if $i \in v(X)$, and to $0$ otherwise.
It is classical that given a monadic formula $\varphi$ with free variables $F$, the language

$L_\varphi = \{\langle u, v \rangle : u, v \models \varphi\} \subseteq A_F^*$

is regular. The proof is done by induction on the formula. It amounts to remarking that
the constructions of the logic, namely disjunction, conjunction, negation and existential
quantification, correspond naturally to language-theoretic operations, namely union,
intersection, complementation and projection. The base cases are obtained by remarking
that the relations of ordering, inclusion, and letter, also correspond to regular languages.

We use a similar approach. To each cost monadic formula $\varphi$ with free variables $F$ over
the signature of words over $A$, we associate the cost function $f_\varphi$ over $A_F^*$ defined by

$f_\varphi(\langle u, v \rangle) = [\![\varphi]\!](u, v)$.

We aim at solving cost monadic logic by providing an explicit representation of the cost
functions $f_\varphi$. For reaching this goal, we need to define a family of cost functions $\mathcal{F}$ that
contains suitable constants, has effective closure properties and decision procedures.

The first assumption we make is the closure under composition with a morphism. I.e.,
if $f$ is a cost function in $\mathcal{F}$ over $A^*$ and $h$ is a morphism from $B^*$ ($B$ being another
alphabet) to $A^*$, we require $f \circ h$ to also belong to $\mathcal{F}$. In particular, this operation allows
us to change the alphabet, and hence to add new variables when required. It corresponds
to the closure under inverse morphism for regular languages.

Fact 2.0.4 gives us a very precise idea of the constants we need. The constants correspond
to the formulae of the form $R(X_1, \ldots, X_n)$ as well as their negation. As mentioned
above, for such a formula $\varphi$, $L_\varphi$ is regular. Hence, it is sufficient for us to require that
the characteristic function $\chi_L$ belongs to $\mathcal{F}$ for each regular language $L$. The remaining
constants correspond to the formula $|X| \le N$. We have that $f_{|X| \le N}(\langle u, X = E \rangle) = |E|$.
This corresponds to counting the number of occurrences of letters from $A \times \{1\}$ in a word
over $A \times \{0, 1\}$. Up to a change of alphabet (thanks to the closure under composition with
a morphism) it will be sufficient for us that $\mathcal{F}$ contains the function "size" which maps each
word $u \in \{a, b\}^*$ to $|u|_a$.
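The encoding of a word together with a valuation, and the value of the constant $|X| \le N$ on such an encoding, can be sketched as follows. This is an illustration under assumptions: the function names `encode` and `size_of` are hypothetical, and the valuation is represented as a dictionary from variable names to sets of positions, a choice made here for convenience.

```python
def encode(word, valuation, variables):
    """Return the word <u, v> as a list of letters of A x {0,1}^F.

    Position i (numbered from 1, as in the text) carries the letter a_i
    paired with one bit per variable X, equal to 1 iff i belongs to v(X).
    """
    return [(a, tuple(1 if i in valuation[X] else 0 for X in variables))
            for i, a in enumerate(word, start=1)]

def size_of(encoded, index=0):
    """Value of |X| <= N on <u, v>: the number of positions marked for X,
    where X is the variable at the given index in the variable order."""
    return sum(bits[index] for _, bits in encoded)

w = encode("abba", {"X": {1, 4}, "Y": {2}}, ["X", "Y"])
print(w)                    # [('a', (1, 0)), ('b', (0, 1)), ('b', (0, 0)), ('a', (1, 0))]
print(size_of(w))           # 2, the cardinality of v(X)
print(size_of(w, index=1))  # 1, the cardinality of v(Y)
```

Counting marked positions is the same as counting occurrences of one letter after a change of alphabet, which is why the single function "size" suffices.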

Fact 2.0.4 also gives us a very precise idea of the closure properties we need. We
need the closure under min and max for disjunctions and conjunctions. For dealing with
existential and universal quantification, we need the new operations of inf-projection and
sup-projection. Given a mapping $f$ from $A^*$ to $\mathbb{N} \cup \{\infty\}$ and a mapping $h$ from $A$ to $B$ that
we extend into a morphism from $A^*$ to $B^*$ ($B$ being another alphabet), the inf-projection
of $f$ with respect to $h$ is the mapping $f_{\inf,h}$ from $B^*$ to $\mathbb{N} \cup \{\infty\}$ defined for all $v \in B^*$ by:

$f_{\inf,h}(v) = \inf\{f(u) : h(u) = v\}$.

Similarly, the sup-projection of $f$ with respect to $h$ is the mapping $f_{\sup,h}$ from $B^*$ to $\mathbb{N} \cup \{\infty\}$
defined for all $v \in B^*$ by:

$f_{\sup,h}(v) = \sup\{f(u) : h(u) = v\}$.
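Since $h$ maps letters to letters, a word $v$ has finitely many preimages, so both projections can be computed by brute force on small inputs. The sketch below is illustrative only: the helper names are hypothetical, the enumeration is exponential in the length of $v$, and the conventions $\inf \emptyset = \infty$ and $\sup \emptyset = 0$ are assumptions made for words having no preimage.

```python
import itertools
import math

def preimages(h, v):
    """All words u over A with h(u) = v, for a letter-to-letter map h (a dict)."""
    choices = [[a for a in h if h[a] == b] for b in v]
    return ("".join(u) for u in itertools.product(*choices))

def inf_projection(f, h):
    """f_{inf,h}: the least value of f over the preimages of v."""
    return lambda v: min((f(u) for u in preimages(h, v)), default=math.inf)

def sup_projection(f, h):
    """f_{sup,h}: the largest value of f over the preimages of v."""
    return lambda v: max((f(u) for u in preimages(h, v)), default=0)

h = {"a": "x", "b": "x", "c": "y"}   # both a and b are projected onto x
count_a = lambda u: u.count("a")     # the function "size"

f_inf = inf_projection(count_a, h)
f_sup = sup_projection(count_a, h)
print(f_inf("xxy"), f_sup("xxy"))  # 0 2  (witnesses: bbc and aac)
print(f_inf("z"), f_sup("z"))      # inf 0  (no preimage at all)
```

The inf-projection plays the role of existential quantification (choose the best marking) and the sup-projection that of universal quantification (consider the worst marking).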
We summarise all the requirements in the following fact. 

Fact 2.0.8. Let $\mathcal{F}$ be a class of cost functions over words such that:

(1) for all regular languages $L$, $\chi_L$ belongs to $\mathcal{F}$,

(2) $\mathcal{F}$ contains the cost function "size",

(3) $\mathcal{F}$ is effectively closed under composition with a morphism, min, max, inf-projection
and sup-projection,

(4) $\preccurlyeq$ is decidable over $\mathcal{F}$,

then the boundedness, divergence and domination problems are decidable for cost monadic 
logic over words. 

The remainder of the paper is devoted to the introduction of the class of recognisable
cost functions, and to showing that this class satisfies all the assumptions of Fact 2.0.8. In
particular, Item 1 is established as Example 4.2.2. Item 2 is achieved in Example 4.2.1.
Item 3 is the subject of Fact 4.2.3, Corollary 4.5 and Theorems 4.7 and 4.13. Finally, Item 4
is established in Theorem 4.6.

Thus we deduce our main result. 

Theorem 2.1. The domination relation is decidable for cost-monadic logic over finite 
words. 

All these results are established in Section 4. However, we need first to introduce the
notion of stabilisation monoids, as well as some of its key properties. This is the subject of
Section 3.



3. The algebraic model: stabilisation monoids 

The purpose of this section is to describe the algebraic model of stabilisation monoids. This
model has, a priori, no relation with the previous section. However, in Section 4, in which
we define the notion of a recognisable cost function, we will use this model for describing
cost functions.

The key idea, directly inspired by the work of Leung, Simon and Kirsten, is
to develop an algebraic notion (the stabilisation monoid) in which a special operator (called
the stabilisation, $\sharp$) expresses what happens when we iterate some element "a lot of times".
In particular, it says whether or not we should count the number of iterations of
this element. The terminology "a lot of times" is very vague, and for this reason such a
formalism cannot describe functions precisely. However, it is perfectly suitable for describing
cost functions.

The remaining part of the section is organised as follows. We first introduce the notion
of stabilisation monoids in Section 3.1, paying special attention to giving it an intuitive
meaning. In Section 3.2 we introduce the key notions of computations, under-computations
and over-computations, as well as the two central results of existence of computations
(Theorem 3.3) and "unicity" of their values (Theorem 3.4). These notions and results
form the main technical core of this work. Then Section 3.4 is devoted to the proof of
Theorem 3.3, and Section 3.5 to the proof of Theorem 3.4.



3.1. Stabilisation monoids. A semigroup $\mathbf{S} = \langle S, \cdot \rangle$ is a set $S$ equipped with an associative
operation '$\cdot$'. A monoid is a semigroup such that the product has a neutral element 1,
i.e., such that $1 \cdot x = x \cdot 1 = x$ for all $x \in S$. Given a semigroup $\mathbf{S} = \langle S, \cdot \rangle$, we extend
the product to products of arbitrary length by defining $\pi$ from $S^+$ to $S$ by $\pi(a) = a$
and $\pi(ua) = \pi(u) \cdot a$. If the semigroup is a monoid of neutral element 1, we further
set $\pi(\varepsilon) = 1$. All monoids are semigroups, and conversely it is sometimes convenient to
transform a semigroup $\mathbf{S}$ into a monoid simply by the adjunction of a new neutral
element 1.

An idempotent in $\mathbf{S}$ is an element $e \in S$ such that $e \cdot e = e$. We denote by $E(\mathbf{S})$ the
set of idempotents in $\mathbf{S}$. An ordered semigroup $\langle S, \cdot, \le \rangle$ is a semigroup $\langle S, \cdot \rangle$ together with
an order $\le$ over $S$ such that the product $\cdot$ is compatible with $\le$; i.e., $a \le a'$ and $b \le b'$
implies $a \cdot b \le a' \cdot b'$. An ordered monoid is an ordered semigroup, the underlying semigroup
of which is a monoid.

We are now ready to introduce the new notions of stabilisation semigroups and stabil- 
isation monoids. 

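Definition 3.1. A stabilisation semigroup $\langle S, \cdot, \le, \sharp \rangle$ is a finite ordered semigroup $\langle S, \cdot, \le \rangle$
together with an operator $\sharp : E(\mathbf{S}) \rightarrow E(\mathbf{S})$ (called the stabilisation) such that:

• for all $e \le f$ in $E(\mathbf{S})$, $e^\sharp \le f^\sharp$;

• for all $a, b \in S$ with $a \cdot b \in E(\mathbf{S})$ and $b \cdot a \in E(\mathbf{S})$, $(a \cdot b)^\sharp = a \cdot (b \cdot a)^\sharp \cdot b$;

• for all $e \in E(\mathbf{S})$, $e^\sharp \le e$;

• for all $e \in E(\mathbf{S})$, $(e^\sharp)^\sharp = e^\sharp$.

It is called a stabilisation monoid if furthermore $\langle S, \cdot \rangle$ is a monoid and $1^\sharp = 1$, in which 1 is
the neutral element of the monoid.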

The intuition is that $e^\sharp$ represents what is the value of $e^n$ when $n$ becomes "very large".
Some consequences of the definitions, namely

for all $e \in E(\mathbf{S})$, $e^\sharp = e \cdot e^\sharp = e^\sharp \cdot e = e^\sharp \cdot e^\sharp = (e^\sharp)^\sharp$,

make perfect sense in this respect: repeating "a lot of e's" is equivalent to seeing one $e$
followed by "a lot of e's", etc. This meaning of $e^\sharp$ is in some sense a limit behaviour. This
is an intuitive reason why $\sharp$ is not used for non-idempotent elements. Consider for instance
the element 1 in $\mathbb{Z}/2\mathbb{Z}$. Then iterating it yields 0 at even iterations, and 1 at odd ones.
This alternation prevents giving a clear meaning to what is the result of "iterating a lot
of times" 1.

However, this view is incompatible with the classical view on monoids, in which by
induction, if $e \cdot e = e$, then $e^n = e$ for all $n \ge 1$. The idea in stabilisation monoids is
that the product is something that cannot be iterated "a lot of times". For this reason,
considering that for all $n \ge 1$, $e^n = e$ is correct for "small values of $n$", but becomes
"incorrect" for "large values of $n$". The value of $e^n$ is $e$ if $n$ is "small", and it is $e^\sharp$ if $n$ is
"big". Most of the remainder of the section is devoted to the formalisation of this intuition,
via the use of the notion of computations.

^ The equation $(a \cdot b)^\sharp = a \cdot (b \cdot a)^\sharp \cdot b$ of Definition 3.1 states that $\sharp$ is a consistent mapping in the sense of [26, 28].

^ On the consequences of Definition 3.1: indeed, $e^\sharp = (e \cdot 1)^\sharp = e \cdot (1 \cdot e)^\sharp \cdot 1 = e \cdot e^\sharp$ using consistency. In the same way $e^\sharp = e^\sharp \cdot e$. Since $\sharp$ maps
idempotents to idempotents, $e^\sharp = e^\sharp \cdot e^\sharp$ is obvious, and $(e^\sharp)^\sharp = e^\sharp$ holds by definition.

Even if the material necessary for working with stabilisation monoids has not yet been
provided, it is already possible to give some examples of stabilisation monoids that are
constructed from an informal idea of their intended meaning.

Example 3.1.1. In this example we start from an informal idea of what we would like to 
compute, and construct a stabilisation monoid from it. The explanations have to remain 
informal at this point in the exposition of the theory. However, this example should illustrate 
how we can already reason easily at this level of understanding. 

Imagine you want, among words over a and b, to separate the ones that possess "a lot 
of occurrences of a's" from the ones that have only "a few occurrences of a's", i.e., imagine 
you want to describe a stabilisation monoid that "counts" the number of occurrences of a's. 

For doing this, we should separate three "kinds" of words: 

• the kind of words with no occurrence of a; let b be the corresponding element in the 
stabilisation monoid (since the word b is of this kind), 

• the kind of words with at least one occurrence of a, but only "a few" such occur- 
rences; let a be the corresponding element in the stabilisation monoid (since the 
word a is of this kind) , 

• the kind of words with "a lot of occurrences of a's"; let 0 be the corresponding
element in the stabilisation monoid.

The words that we intend to separate — the ones with a lot of a's — are the ones of kind 0. 
With these three elements known, let us complete the definition of the stabilisation monoid. 

Of course, iterating twice, or "many times", words which contain no occurrences of a
yields words with no occurrences of the letter a. We capture this with the equalities $b =
b \cdot b = b^\sharp$.

Now words that have at least one a, but only a small number of occurrences of a, should
not be affected by appending b letters to their left or to their right. I.e., we set $a = b \cdot a = a \cdot b$.
Even more, appending a word with "few a's" to another word with "few a's" does also give
a word with "few a's". I.e., we set $a \cdot a = a$.

However, if we iterate "a lot of times" a word with at least one occurrence of a, it
yields a word with "a lot of a's". Hence we set $a^\sharp = 0$. We also easily get the equations
$b \cdot 0 = 0 \cdot b = a \cdot 0 = 0 \cdot a = 0 \cdot 0 = 0^\sharp = 0$ by inspecting all situations. We reach the following
description of the stabilisation monoid:





  ·  |  b   a   0  |  ♯
 ----+-------------+----
  b  |  b   a   0  |  b
  a  |  a   a   0  |  0
  0  |  0   0   0  |  0


The left part describes the product operation '·', while the rightmost column gives the
value of stabilisation $\sharp$ (in general, this column may be partially defined since stabilisation
is defined only for idempotents).

To complete the definition, we need to define the ordering over $\{b, a, 0\}$. The least we
can do is setting $0 \le a$ and $x \le x$ for all $x \in \{b, a, 0\}$. This is mandatory, since by definition
of a stabilisation monoid $a^\sharp \le a$, and we have $a^\sharp = 0$. The reader can check that all the
properties that we expect from a stabilisation monoid are now satisfied.
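The check left to the reader can also be done by brute force. The sketch below tests every requirement of Definition 3.1 on the three-element structure of this example; the function name `is_stabilisation_monoid` and the dictionary-based encoding of the product, order and stabilisation are assumptions made for illustration.

```python
from itertools import product

def is_stabilisation_monoid(elems, one, mul, leq, sharp):
    """Brute-force check of Definition 3.1 on a finite structure.

    mul maps (x, y) to x·y, leq is the set of pairs (x, y) with x <= y,
    and sharp maps each idempotent e to its stabilisation.
    """
    le = lambda x, y: (x, y) in leq
    idem = [e for e in elems if mul[e, e] == e]
    pairs = list(product(elems, repeat=2))
    triples = list(product(elems, repeat=3))
    checks = [
        # monoid: associativity and neutral element
        all(mul[mul[x, y], z] == mul[x, mul[y, z]] for x, y, z in triples),
        all(mul[one, x] == x == mul[x, one] for x in elems),
        # partial order: reflexive, antisymmetric, transitive
        all(le(x, x) for x in elems),
        all(x == y or not (le(x, y) and le(y, x)) for x, y in pairs),
        all(le(x, z) for x, y, z in triples if le(x, y) and le(y, z)),
        # the product is compatible with the order
        all(le(mul[a, b], mul[c, d]) for a, c in leq for b, d in leq),
        # sharp maps E(S) into E(S), satisfies the four axioms, and 1^sharp = 1
        all(sharp[e] in idem for e in idem),
        all(le(sharp[e], sharp[f]) for e in idem for f in idem if le(e, f)),
        all(sharp[mul[a, b]] == mul[mul[a, sharp[mul[b, a]]], b]
            for a, b in pairs if mul[a, b] in idem and mul[b, a] in idem),
        all(le(sharp[e], e) for e in idem),
        all(sharp[sharp[e]] == sharp[e] for e in idem),
        sharp[one] == one,
    ]
    return all(checks)

# Example 3.1.1: the product keeps the "largest" kind among b, a, 0.
elems = ["b", "a", "0"]
rank = {"b": 0, "a": 1, "0": 2}
mul = {(x, y): max(x, y, key=rank.get) for x in elems for y in elems}
sharp = {"b": "b", "a": "0", "0": "0"}
order = {(x, x) for x in elems} | {("0", "a")}

print(is_stabilisation_monoid(elems, "b", mul, order, sharp))                 # True
print(is_stabilisation_monoid(elems, "b", mul, order - {("0", "a")}, sharp))  # False
```

The second call drops the pair $0 \le a$ and fails, which matches the remark that this pair is mandatory.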

The intuition behind the ordering is that depending on what we mean by "a lot of a's",
the same word can be of kind a or of kind 0. For instance the word $a^{100}$ is of kind a if we
consider that 100 is "a few", while it is of kind 0 if we consider that 100 is "a lot". For this
reason, there is a form of continuum that allows us to go from a to 0. The order $\le$ captures
this relationship between elements.

It is sometimes convenient to present a stabilisation monoid by a form of Cayley graph:

[Figure (not reproduced): Cayley graph of the stabilisation monoid of Example 3.1.1, with vertices b, a and 0.]

As in a standard Cayley graph, there is an edge labeled by $y$ going from every vertex $x$
to vertex $x \cdot y$. Furthermore, there is a double arrow linking every idempotent $x$ to its
stabilised version $x^\sharp$.

Example 3.1.2. Imagine we want to compute the size of the longest sequence of consecutive 
a's in words over the alphabet {a, b}. Then we would separate four "kinds" of words: 

• the kind consisting only of the empty word; let it be 1, 

• the kind of words, containing only occurrences of a, at least one occurrence of it, 
but only "a few" of them; let the corresponding element be a, 

• the kind of words containing at least one b, but no long sequence of consecutive a's; 
let the corresponding element be b, 

• the kind of words that contain a long sequence of consecutive a's; let the corre- 
sponding element be 0. 

Computing the size of the longest sequence of consecutive a's means identifying the words
containing a "long" sequence of this type, i.e., it means to separate words of kind 0 from
words of kind a or b.

The table of product and stabilisation is then naturally the following:

  ·  |  1   a   b   0  |  ♯
 ----+-----------------+----
  1  |  1   a   b   0  |  1
  a  |  a   a   b   0  |  0
  b  |  b   b   b   0  |  b
  0  |  0   0   0   0  |  0


We complete the definition of this stabilisation monoid by defining the ordering. For
this, we let $x \le x$ hold for all $x \in \{1, a, b, 0\}$, and we further set $0 \le a$ since $a^\sharp = 0$. Since
$0 = 0 \cdot b$, $0 \le a$ and $a \cdot b = b$, we need also to set $0 \le b$ for ensuring the compatibility of
the product with the order. Once more the ordering corresponds to the intuition that there
exist words that can be of kind a (e.g., the word $a^{100}$) or b (e.g., the word $ba^{100}$), and that
have kind 0 if we change what we mean by "a lot of".

Remark 3.1.3. The notion of stabilisation monoids (or stabilisation semigroups) extends 
the notion of standard monoids (or semigroups). Many standard results concerning monoids 
have natural counterparts in the world of stabilisation monoids. For making this relationship 
more precise, let us describe the canonical way to translate a monoid into a stabilisation 
monoid. Let $\mathbf{M} = \langle M, \cdot \rangle$ be a monoid. The corresponding stabilisation monoid is:

$\mathbf{M}_\sharp = \langle M, \cdot, =, \mathrm{id}_{E(\mathbf{M})} \rangle$.

In other words, the monoid is extended with a trivial ordering (the equality), and the
stabilisation is simply the identity over idempotents. The reader can easily check that this
object indeed respects the definition of a stabilisation monoid.

If we refer to the intuition we gave above, we extend the monoid by an identity sta- 
bilisation. This means that for all idempotents, we do not make the distinction between 
iterating it "a few times" or "a lot of times". Said differently, we never have to count the 
number of occurrences of the idempotents. This is consistent with the principle that a 
standard monoid has no counting capabilities. 

Remark 3.1.4. The order plays an important role, even if it is sometimes hidden. Let
us first remark that given a stabilisation monoid, it may happen that changing the order
yields another valid stabilisation monoid (as for ordered monoids). In general, there
is a least order such that the structure is a valid stabilisation monoid. It is the intersection
of all the "valid orders", and can be computed by a least fix-point. However, there is no
maximal "valid order" in general.
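One possible reading of this least fix-point computation is sketched below. It is an assumption of this illustration that the closure rules are exactly: reflexivity, $e^\sharp \le e$, transitivity, compatibility with the product, and monotonicity of $\sharp$; the function names are hypothetical. When the resulting relation is antisymmetric it is contained in every valid order, and an antisymmetry failure would show that no valid order exists.

```python
def least_order(elems, mul, sharp):
    """Least relation containing x <= x and e^sharp <= e, closed under
    transitivity, compatibility with the product, and monotonicity of sharp."""
    idem = {e for e in elems if mul[e, e] == e}
    rel = {(x, x) for x in elems} | {(sharp[e], e) for e in idem}
    while True:
        new = set(rel)
        new |= {(x, z) for x, y in rel for y2, z in rel if y == y2}
        new |= {(mul[a, b], mul[c, d]) for a, c in rel for b, d in rel}
        new |= {(sharp[e], sharp[f]) for e, f in rel if e in idem and f in idem}
        if new == rel:          # nothing was added: the fix-point is reached
            return rel
        rel = new

def is_antisymmetric(rel):
    return all(x == y or (y, x) not in rel for x, y in rel)

# The monoid of Example 3.1.2: 1 is neutral, 0 is absorbing, a·a = a, and
# every other product of a and b equals b.
elems = ["1", "a", "b", "0"]

def times(x, y):
    if x == "1":
        return y
    if y == "1":
        return x
    if "0" in (x, y):
        return "0"
    return "a" if (x, y) == ("a", "a") else "b"

mul = {(x, y): times(x, y) for x in elems for y in elems}
sharp = {"1": "1", "a": "0", "b": "b", "0": "0"}

order = least_order(elems, mul, sharp)
print(sorted(p for p in order if p[0] != p[1]))  # [('0', 'a'), ('0', 'b')]
print(is_antisymmetric(order))                   # True
```

The computation recovers the two non-trivial inequalities $0 \le a$ and $0 \le b$ that Example 3.1.2 derives by hand.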

More interestingly, there exist structures $\langle M, \cdot, \sharp \rangle$ which have no order, which satisfy
the definition of a stabilisation monoid, except for the rules involving the order, and
such that it is not possible to construct an order making them a valid stabilisation
monoid. An example is the 10-element structure which would be obtained for describing
the property "there is an even number of small maximal segments of a's". But this is what
we want. Indeed, a closer inspection would reveal that this property does not have the
monotonic behaviour that we could use for defining a function. Consider for instance a
word of the form $a^1 b a^2 b a^3 b \ldots b a^n$, and assume a small maximal segment of consecutive a's
means a segment of length at most $m$. Then, if $m$ takes an even value at most equal to $n$,
the word should be considered as in the language, while if it takes an odd value at most
equal to $n$, the word should be thought of as outside the language. Thus, when $m$ ranges over
the interval $\{0, \ldots, n\}$, the word is alternately thought of as in the language or outside the
language. This is typically a non-monotonic behaviour. Keeping in mind cost monadic logic
from the previous section, we see that no formula would be able to express such a property.
Requiring an order in the definition of stabilisation monoids rules out such situations.

We have seen through the above examples how easy it is to work with stabilisation 
monoids at an informal level. An important part of the rest of the section is dedicated 
to providing formal definitions for this informal reasoning. In the above explanations, we 
worked with the imprecise terminology "a few" and "a lot of" . Of course, the value (what 
we referred to as "the kind" in the examples) of a word depends on what is the frontier we 
fix for separating "a few" from "a lot" . 

We continue the description of stabilisation monoids by introducing the key notion of 
computations. These objects describe how to evaluate a long "product" in a stabilisation 
monoid. 

3.2. Computations, under-computations and over-computations. Our goal is now
to provide a formal meaning for the notion of stabilisation semigroups and stabilisation
monoids, allowing us to avoid terms such as "a lot" or "a few". More precisely, we develop in
this section the notion of computations. A computation is a tree which is used as a witness
that a word evaluates to a given value.

We fix for the rest of the section a stabilisation semigroup $\mathbf{S} = \langle S, \cdot, \le, \sharp \rangle$. We
develop first the notion for semigroups, and then see how to use it for monoids in Section 3.3
(we will see that the notions are in close correspondence).

Let us consider a word $u \in S^+$ (it is a word over $S$, seen as an alphabet). Our objective
is to define a "value" for this word. In standard semigroups, the "value" of $u$ is simply $\pi(u)$,
the product of the elements appearing in the word. But, what should the "value" be for a
stabilisation semigroup? All the informal semantics we have seen so far were based on the
distinction between "a few" and "a lot". This means that the value the word has depends
on what is considered "a few", and what is considered "a lot". This is captured by
the fact that the value is parameterised by a positive integer $n$ which can be understood
as a threshold separating what is considered as "a few" from what is considered as "a lot".
For each choice of $n$, the word $u$ may have a different value in the stabilisation
semigroup.

Let us assume a threshold value $n$ is fixed. We still lack a general mechanism for
associating to each word $u$ over $S$ a value in $S$. This is the purpose of computations.
Computations are proofs (taking the form of a tree) that a word should evaluate to a given
value. Indeed, in the case of usual semigroups, the fact that a word $u$ evaluates to $\pi(u)$ can
be witnessed by a binary tree, the leaves of which, read from left to right, yield the word $u$,
and such that each inner node is labelled by the product of the labels of its children. Clearly,
the root of such a tree is labelled by $\pi(u)$, and the tree can be seen as a proof of correctness
for this value.

The notion of n-computation that we define now is a variation around this principle. 
For more ease in its use, it comes in three variants: under-computations, over-computations 
and computations. 

Definition 3.2. An n-under-computation $T$ for the word $u = a_1 \ldots a_l \in S^+$ is an ordered
unranked tree with $l$ leaves, each node $x$ of which is labelled by an element $v(x) \in S$ called
the value of $x$, and such that for all nodes $x$ of children $y_1, \ldots, y_k$ (read from left to right),
one of the following cases holds:

Leaf: $k = 0$, and $v(x) \le a_m$ where $x$ is the $m$-th leaf of $T$ (read from left to right).

Binary node: $k = 2$, and $v(x) \le v(y_1) \cdot v(y_2)$,

Idempotent node: $2 \le k \le n$ and $v(x) \le e$ where $e = v(y_1) = \cdots = v(y_k) \in E(\mathbf{S})$,
Stabilisation node: $k > n$ and $v(x) \le e^\sharp$ where $e = v(y_1) = \cdots = v(y_k) \in E(\mathbf{S})$.

An n-over-computation is obtained by replacing everywhere "$v(x) \le$" by "$v(x) \ge$". An
n-computation is obtained by replacing everywhere "$v(x) \le$" by "$v(x) =$", i.e., an
n-computation is a tree which is at the same time an n-under-computation and an
n-over-computation.

The value of a [under-/over-]computation is the value of its root. 
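The definition translates directly into a recursive checker. The sketch below is illustrative and makes several assumptions: a tree is encoded as a pair (value, list of children), the function name `is_computation` and its `cmp` parameter are hypothetical, and the monoid used is the one of Example 3.1.1. Passing equality as `cmp` checks n-computations, while passing the order checks n-under-computations.

```python
def is_computation(tree, word, n, mul, sharp, cmp=lambda x, y: x == y):
    """Check that `tree` is an n-computation for `word` (Definition 3.2).

    cmp(label, target) compares the label of a node with the value prescribed
    by a rule: equality for computations, <= for under-computations.
    """
    position = 0  # index of the next leaf, read from left to right

    def visit(node):
        nonlocal position
        value, children = node
        k = len(children)
        if k == 0:                                    # leaf
            if position >= len(word):
                return False
            letter = word[position]
            position += 1
            return cmp(value, letter)
        if not all(visit(child) for child in children):
            return False
        labels = [child[0] for child in children]
        e = labels[0]
        allowed = []
        if k == 2:                                    # binary node
            allowed.append(mul[labels[0], labels[1]])
        if len(set(labels)) == 1 and mul[e, e] == e:
            if 2 <= k <= n:                           # idempotent node
                allowed.append(e)
            elif k > n:                               # stabilisation node
                allowed.append(sharp[e])
        return any(cmp(value, target) for target in allowed)

    return visit(tree) and position == len(word)

# The stabilisation monoid of Example 3.1.1.
elems = ["b", "a", "0"]
rank = {"b": 0, "a": 1, "0": 2}
mul = {(x, y): max(x, y, key=rank.get) for x in elems for y in elems}
sharp = {"b": "b", "a": "0", "0": "0"}
leq = lambda x, y: x == y or (x, y) == ("0", "a")

leaf = lambda x: (x, [])
flat = ("0", [leaf("a")] * 5)                          # one stabilisation node
split = ("a", [("a", [leaf("a")] * 3), ("a", [leaf("a")] * 2)])

print(is_computation(flat, "aaaaa", 4, mul, sharp))    # True, value 0
print(is_computation(split, "aaaaa", 4, mul, sharp))   # True, value a
print(is_computation(flat, "aaaaa", 5, mul, sharp))    # False: 5 children is "a few"
print(is_computation(flat, "aaaaa", 5, mul, sharp, cmp=leq))  # True as under-computation
```

The first two calls exhibit two 4-computations of the same word with different values, and the last two show that a 4-computation need not be a 5-computation although it remains a 5-under-computation.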

We also use the following notations for easily denoting constructions of [under-/over-]computations.
Given a non-leaf $S$-labelled tree $T$, denote by $T^i$ the subtree of $T$ rooted at
the $i$-th child of the root. For $a$ in $S$, we denote by $a$ the tree restricted to a single leaf of
value $a$. If furthermore $T_1, \ldots, T_k$ are also $S$-labelled trees, then $a[T_1, \ldots, T_k]$ denotes the
tree of root labelled $a$, of degree $k$, and such that $T^i = T_i$ for all $i = 1 \ldots k$.

It should be immediately clear that these notions have to be manipulated with care, as 
shown by the following example. 

Example 3.2.1. Two examples of computations are given in Figure 1. Both correspond to
the stabilisation semigroup (in fact monoid) of Example 3.1.1, the aim of which is to count
the number of occurrences of the letter a in a word. Both correspond to the evaluation
of the same word. Both correspond to the same threshold value $n$. However, these two
computations do not have the same value. We will see below how to compare computations
and overcome this problem.

[Figure 1 (trees not reproduced): a 4-computation of value 0 and height 4, and a 4-computation of value a and height 5, both over the word babbbabbbbbaaa.]

Figure 1: Two 4-computations for the word babbbabbbbbaaa in the monoid of Example 3.1.1

There is another problem. Indeed, it is straightforward to construct an n-computation 
for some word, simply by constructing a computation which is a binary tree, and would use 
no idempotent nodes nor stabilisation nodes. However, such a computation would of course 
not be satisfactory since every word $u$ would be evaluated in this way as $\pi(u)$. We do not
want that. This would mean that the quantitative aspect contained in the stabilisation has 
been lost. We need to determine what is a relevant computation in order to rule out such 
computations. 

Thus we need to answer the following questions: 

(1) What are the relevant computations? 

(2) Can we construct a relevant n-computation for all words and all n? 

(3) How do we relate the different values that n-computations may have on the same 
word? 

The answer to the first question is that we are only interested in computations of 
small height, meaning of height bounded by some function of the semigroup. With such a 
restriction, it is not possible to use binary trees as computations. However, this choice makes 
the answer to the second question less obvious: does there always exist a computation? 
Theorem 3.3. For all words $u \in S^+$ and all non-negative integers $n$, there exists an
n-computation of height at most $3|S|$.

This result is an extension of the forest factorisation theorem of Simon [32] (which
corresponds to the case of a semigroup). Its proof, which is independent from the rest of
this work, is presented in Section 3.4.




[Figure 2 (image not reproduced): the areas of n-over-computations, n-under-computations and n-computations, drawn against the threshold n on the horizontal axis.]

Figure 2: Structure of computations, under-computations, and over-computations.

The third question remains: how to compare the values of different computations over 
the same word? An answer to this question in its full generality makes use of under- and 
over-computations. 

Theorem 3.4. For all non-negative integers $p$, there exists a polynomial $\alpha : \mathbb{N} \rightarrow \mathbb{N}$ such
that for all n-under-computations over some word $u$ of value $a$ and height at most $p$, and all
$\alpha(n)$-over-computations of value $b$ over the same word $u$,

$a \le b$.

Remark first that since computations are special instances of under- and over-computations,
Theorem 3.4 holds in particular for comparing the values of computations. The
proof of Theorem 3.4 is the subject of Section 3.5.

We have illustrated the above results in Figure 2. It depicts the relationship between
computations in some idealised stabilisation monoid $\mathbf{S}$. In this drawing, assume some word
over some stabilisation semigroup is fixed, as well as some integer $p \ge 3|S|$. We aim at
representing for each $n$ the possible values of an n-computation, an n-under-computation
or an n-over-computation for this word of height at most $p$. In all the explanations below,
all computations are supposed not to exceed height $p$.

The horizontal axis represents the n-coordinate. The values in the stabilisation semigroup
being ordered, the vertical axis represents the values in the stabilisation semigroup
(for the picture, we assume the values in the stabilisation semigroup totally ordered). Thus
an n-computation (or n-under- or n-over-computation) is placed at a point of horizontal
coordinate $n$ and vertical coordinate the value of the computation.



When measuring the height of a tree, the convention is that leaves do not count. With this convention,
substituting a tree for a leaf of another tree results in a tree of height the sum of the heights of the two trees.

We can now interpret the properties of the computations in terms of this figure. First
of all, under-computations as well as over-computations, as opposed to computations,
enjoy certain forms of monotonicity as shown by the fact below.

Fact 3.4.1. For $m \le n$, all $m$-under-computations are also $n$-under-computations, and
all $n$-over-computations are also $m$-over-computations (using the fact that $e^\sharp \le e$ for all
idempotents $e$).

Any $n$-under-computation of value $a$ can be turned into an $n$-under-computation of
value $b$ for all $b \le a$ (by changing the root label from $a$ to $b$). Similarly any $n$-over-computation
of value $a$ can be turned into an $n$-over-computation of value $b$ for all $b \ge a$.
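The monotonicity stated in Fact 3.4.1 can be observed on a small example. The following Python sketch checks whether a labelled tree is an $n$-computation, an $n$-under-computation or an $n$-over-computation. It is an illustration under assumptions made here, not part of the paper: the three-element monoid {b, a, 0} (with b the unit, a# = 0 and order 0 ≤ a ≤ b), the names `Node` and `check`, and the convention that nodes of degree 2 are always treated as binary nodes are all hypothetical choices.

```python
from dataclasses import dataclass, field

# Hypothetical toy stabilisation monoid: "b" is the unit, "0" is absorbing,
# a.a = a, the order is 0 <= a <= b, and b# = b, a# = 0, 0# = 0.
RANK = {"0": 0, "a": 1, "b": 2}
SHARP = {"b": "b", "a": "0", "0": "0"}


def prod(x, y):
    if "0" in (x, y):
        return "0"
    return y if x == "b" else x  # b.y = y, a.a = a.b = a


def leq(x, y):
    return RANK[x] <= RANK[y]


@dataclass
class Node:
    value: str
    children: list = field(default_factory=list)


def check(tree, word, n, mode="exact"):
    """True if `tree` is an n-computation ("exact"), an n-under-computation
    ("under") or an n-over-computation ("over") for `word`."""
    ok = {
        "exact": lambda v, target: v == target,
        "under": lambda v, target: leq(v, target),
        "over": lambda v, target: leq(target, v),
    }[mode]
    letters = iter(word)

    def visit(node):
        k = len(node.children)
        if k == 0:  # leaf: compared with the next letter of the word
            letter = next(letters, None)
            return letter is not None and ok(node.value, letter)
        if k == 1 or not all(visit(c) for c in node.children):
            return False
        vals = [c.value for c in node.children]
        if k == 2:  # binary node (assumption: degree 2 is always binary)
            return ok(node.value, prod(vals[0], vals[1]))
        e = vals[0]
        if any(v != e for v in vals) or prod(e, e) != e:
            return False
        # idempotent node if 3 <= k <= n, stabilisation node if k > n
        return ok(node.value, e if k <= n else SHARP[e])

    return visit(tree) and next(letters, None) is None


leaves = [Node("a") for _ in range(4)]
low, high = Node("0", leaves), Node("a", leaves)
```

Over the word aaaa, the tree `low` is a 3-computation but not a 5-computation, while it is an under-computation for both thresholds; symmetrically, `high` is a 5-computation and an over-computation for both thresholds.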



Fact 3.4.1 is illustrated by Figure 2. It means that over-computations define a left
and upward-closed area, while the under-computations define a right and downward-closed
area. Hence, in particular, the delimiting lines are non-decreasing. Furthermore, since
computations are at the same time over-computations and under-computations, the area
of computations lies inside the intersection of under-computations and over-computations.



Since the height $p$ is chosen to be at least $3|S|$, Theorem 3.3 provides even more
information. Namely, for each value of $n$, there exists an $n$-computation. This means in the
picture that the area of computations crosses every column. However, since computations
do not enjoy monotonicity properties, the shape of the area of computations can be quite
complicated. Finally, Theorem 3.4 states that the frontier of under-computations and the
frontier of over-computations are not far from each other. More precisely, if we choose
an element $a$ of the stabilisation semigroup, and we draw a horizontal line at altitude $a$,
if the frontier of under-computations is above or at $a$ for threshold $n$, then the frontier
of over-computations is also above or at $a$ at threshold $\alpha(n)$. Hence the frontier of
over-computations is always below the one of under-computations, but it essentially grows at
the same speed, with a delay of at most $\alpha$.

Remark 3.4.2. In the case of standard semigroups or monoids (which can be seen as
stabilisation monoids or semigroups according to Remark 3.1.3), the notions of computations,
under-computations and over-computations coincide (since the order is trivial), and
the value of the threshold $n$ also becomes irrelevant. This means that the values of all
$n$-[under/over-]computations over a word $u$ coincide with $\pi(u)$. (Such computations coincide
with the "Ramsey factorisations" of the factorisation forest theorem.)



Let us finally remark that Theorem 3.4, which is a consequence of the axioms of stabil- 
isation semigroups, is also sufficient for deducing them. This is formalised by the following 
proposition. 

Proposition 3. Let $\mathbf{S} = (S, \cdot, \le, \sharp)$ be a structure consisting of a finite set $S$, a binary
operation $\cdot$ from $S^2$ to $S$, a partial order $\le$, and a mapping $\sharp$ from $S$ to $S$ defined over
the idempotents of $S$. Assume furthermore that there exists $\alpha$ such that for all
$n$-under-computations for some word $u$ of value $a$ and height at most 3, and all
$\alpha(n)$-over-computations over $u$ of value $b$ and height at most 3,

$a \le b$ .

Then $\mathbf{S}$ is a stabilisation semigroup.

Proof. Let us first prove that $\cdot$ is compatible with $\le$. Assume $a \le a'$ and $b \le b'$, then
$(a \cdot b)[a, b]$ is a 0-under-computation over $ab$ and $(a' \cdot b')[a', b']$ is an $\alpha(0)$-over-computation
over the same word $ab$. It follows that $a \cdot b \le a' \cdot b'$.

Let us now prove that $\cdot$ is associative. Let $a, b, c$ in $S$. Then $((a \cdot b) \cdot c)[(a \cdot b)[a, b], c]$ is
a 0-computation for the word $abc$, and $(a \cdot (b \cdot c))[a, (b \cdot c)[b, c]]$ is an $\alpha(0)$-computation for
the same word. It follows that $(a \cdot b) \cdot c \le a \cdot (b \cdot c)$. The other inequality is symmetric.

Let $e$ be an idempotent. The tree $e^\sharp[\underbrace{e, \dots, e}_{2\alpha(0)+2}]$ is both a 0- and an $\alpha(0)$-computation over
the word $e^{2\alpha(0)+2}$. Furthermore, the tree $(e^\sharp \cdot e^\sharp)[e^\sharp[\underbrace{e, \dots, e}_{\alpha(0)+1}], e^\sharp[\underbrace{e, \dots, e}_{\alpha(0)+1}]]$ is also both a 0-
and an $\alpha(0)$-computation for the same word. It follows that $e^\sharp \cdot e^\sharp = e^\sharp$, i.e., that $\sharp$ maps
idempotents to idempotents.

Let us show that $e^\sharp \le e$ for all idempotents $e$. The tree $e^\sharp[e, e, e]$ is a 2-computation over
the word $eee$, and $e[e, e, e]$ is a $\max(3, \alpha(2))$-computation over the same word. It follows
that $e^\sharp \le e$.

Let us show that stabilisation is compatible with the order. Let $e \le f$ be idempotents.
Then $e^\sharp[\underbrace{e, \dots, e}_{\alpha(0)+1}]$ and $f^\sharp[\underbrace{f, \dots, f}_{\alpha(0)+1}]$ are respectively a 0-computation for the word $e^{\alpha(0)+1}$ and
an $\alpha(0)$-over-computation for the same word. It follows that $e^\sharp \le f^\sharp$.

Let us prove that stabilisation is idempotent. Let $e$ be an idempotent. We already know
that $(e^\sharp)^\sharp \le e^\sharp$ (this makes sense since we have seen that $e^\sharp$ is idempotent). Let us prove the
opposite inequality. Consider the 0-computation $e^\sharp[\underbrace{e, \dots, e}_{(\alpha(0)+1)^2}]$ for the word $e^{(\alpha(0)+1)^2}$, and
the $\alpha(0)$-computation $(e^\sharp)^\sharp[\underbrace{e^\sharp[\underbrace{e, \dots, e}_{\alpha(0)+1}], \dots, e^\sharp[\underbrace{e, \dots, e}_{\alpha(0)+1}]}_{\alpha(0)+1}]$ for the same word. It follows that
$e^\sharp \le (e^\sharp)^\sharp$.

Let us finally prove the consistency of stabilisation. Assume that both $a \cdot b$ and $b \cdot a$ are
idempotents. Let $t_{ab}$ be $(a \cdot b)[a, b]$ (and similarly for $t_{ba}$), i.e., computations for $ab$ and $ba$
respectively. Define now:

$t_{(a \cdot b)^\sharp} = (a \cdot b)^\sharp[\underbrace{t_{ab}, \dots, t_{ab}}_{\alpha(0)+2 \text{ times}}]$ ,

and $t_{a \cdot (b \cdot a)^\sharp \cdot b} = (a \cdot (b \cdot a)^\sharp \cdot b)[a, ((b \cdot a)^\sharp \cdot b)[(b \cdot a)^\sharp[\underbrace{t_{ba}, \dots, t_{ba}}_{\alpha(0)+1 \text{ times}}], b]]$ .

Then both $t_{(a \cdot b)^\sharp}$ and $t_{a \cdot (b \cdot a)^\sharp \cdot b}$ are at the same time 0- and $\alpha(0)$-computations over the same
word $(ab)^{\alpha(0)+2}$ of height at most 3. Since their respective values are $(a \cdot b)^\sharp$ and $a \cdot (b \cdot a)^\sharp \cdot b$,
it follows by our assumption that $(a \cdot b)^\sharp = a \cdot (b \cdot a)^\sharp \cdot b$. □

This result is particularly useful. Indeed, when constructing a new stabilisation semigroup,
we usually aim at proving that it "recognises" some function (to be defined in the
next chapter). It involves proving the hypothesis of Proposition 3. Thanks to Proposition 3,
the syntactic correctness is then for free. This situation occurs in particular in
Sections 4.5 and 4.6 when the closure of recognisable cost-functions under inf-projection and
sup-projection is established.

3.3. Specificities of stabilisation monoids. We have presented so far the notion of
computations in the case of stabilisation semigroups. We are in fact interested in the
study of stabilisation monoids. Monoids differ from semigroups by the presence of a unit
element 1. This element is used for modelling the empty word. We present in this section
the natural variant of the notions of computations for the case of stabilisation monoids.
As is often the case, results from stabilisation semigroups transfer naturally to stabilisation
monoids. The definition is closely related to the one for stabilisation semigroups, and we
see through this section that it is easy to go from the notion for stabilisation monoids to
the one for stabilisation semigroups, and back. As a result, we use the same
name "computation" for the two notions elsewhere in the paper.

Definition 3.5. Let $M$ be a stabilisation monoid. Given a word $u \in M^*$, a stabilisation
monoid $n$-[under/over]-computation (sm-[under/over]-computation for short) for $u$ is an
$n$-[under/over]-computation for some $v \in M^+$, such that it is possible to obtain $u$ from $v$
by deleting some occurrences of the letter 1. All have value 1.

Thus, the definition deals with the implicit presence of arbitrarily many copies of the
empty word (the unit) interleaved with a given word. This definition allows us to work
in a transparent way with the empty word (this saves us case distinctions in proofs). In
particular the empty word has an sm-$n$-computation which is simply 1, of value 1. There
are many others, like $1[1, 1, 1[1, 1]]$ for instance.
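The side condition of Definition 3.5, namely that $u$ is obtained from $v$ by deleting some occurrences of the letter 1, is easy to test mechanically. The sketch below is only an illustration: the function name and the encoding of words as Python strings (with the character "1" standing for the unit) are assumptions made here.

```python
def obtained_by_deleting_units(u, v, unit="1"):
    """True if the word u can be obtained from v by deleting some
    occurrences of the unit letter (and nothing else)."""
    i = 0  # position of the next letter of u still to be matched
    for letter in v:
        if i < len(u) and u[i] == letter:
            i += 1          # this letter of v is kept
        elif letter != unit:
            return False    # only occurrences of the unit may be deleted
    return i == len(u)
```

For instance, the empty word is obtained from 111, and ab is obtained from 1a11b1, so any computation over 1a11b1 is an sm-computation for ab. The greedy matching is sufficient here because the only letter that may be skipped is the unit itself.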

Since each $n$-computation is also an sm-$n$-computation over the same word, it is clear
that Theorem 3.3 can be extended to this situation (just the obvious case of the empty
word needs to be treated separately):

Fact 3.5.1. For all words in $M^*$, there exists an sm-$n$-computation of height at most $3|M|$.

The following lemma shows that sm-[under/over]-computations are not more expressive 
than [under/over]-computations. It is also elementary to prove. 

Lemma 3.6. Given an sm-$n$-computation (resp., sm-$n$-under-computation, sm-$n$-over-computation)
of value $a$ for the empty word, then $a = 1$ (resp. $a \le 1$, $a \ge 1$).

For all non-empty words $u$ and all sm-$n$-computations $T$ (resp., sm-$n$-under-computations,
sm-$n$-over-computations) for $u$ of value $a$, there exists an $n$-computation (resp.,
$n$-under-computation, $n$-over-computation) for $u$ of value $a$. Furthermore, its height is at
most the height of $T$.

Proof. It is simple to eliminate each occurrence of an extra 1 by local modifications of the
structure of the sm-computation: replace subtrees of the form $1[1, \dots, 1]$ by 1, subtrees
of the form $a[T, 1]$ by $T$, and subtrees of the form $a[1, T]$ by $T$, up to elimination of all
occurrences of 1. For the empty word, this results in the first part of the lemma. For
non-empty words, the resulting simplified sm-computation is a computation. The argument
works identically for the under/over variants. □

A corollary is that Theorem 3.4 extends to sm-computations.

Corollary 3.7. For all non-negative integers $p$, there exists a polynomial $\alpha : \mathbb{N} \to \mathbb{N}$ such
that for all sm-$n$-under-computations over some word $u \in M^*$ of value $a$ and height at
most $p$, and all sm-$\alpha(n)$-over-computations of value $b$ over the same word $u$,

$a \le b$ .

Proof. Indeed, the sm-under-computations and sm-over-computations can be turned into
under-computations and over-computations of the same respective values by Lemma 3.6.
The inequality holds for these under- and over-computations by Theorem 3.4. □

There is a last lemma which is related and will prove useful. 

Lemma 3.8. Let $u$ be a word in $M^*$ and $v$ be obtained from $u$ by eliminating some of its
1 letters, then all $n$-[under/over]-computations for $v$ can be turned into an
$n$-[under/over]-computation for $u$ of same value. Furthermore, the height increase is at most 3.

Proof. Let $v = a_1 \dots a_n$, then $u = u_1 \dots u_n$ for $u_i \in 1^* a_i 1^*$. Let $T$ be the
$n$-[under/over]-computation for $v$ of value $a$. It is easy to construct an $n$-[under/over]-computation $T_i$ for
$u_i$ of height at most 3 of value $a_i$. It is then sufficient to plug in $T$ each $T_i$ for the $i$th leaf
of $T$. □

The consequence of these results is that we can work with sm-[under/over]-computations
as with [under/over]-computations. For this reason we shall not distinguish further between
the two notions unless necessary.



3.4. Existence of computations: the proof of Theorem 3.3. In this section, we
establish Theorem 3.3, which states that for all words $u$ over a stabilisation semigroup $S$
and all non-negative integers $n$, there exists an $n$-computation for $u$ of height at most $3|S|$.
Remark that the convention in this context is to measure the height of a tree without
counting the leaves. This result is a form of extension of the factorisation forest theorem
due to Simon:

Theorem 3.9 (Simon [42, 44]). Define a Ramsey factorisation to be an $n$-computation in
the pathological case $n = \infty$ (i.e., there are no stabilisation nodes, and idempotent nodes
are allowed to have arbitrary degree).

For all non-empty words $u$ over a finite semigroup $S$, there exists a Ramsey factorisation
for $u$ of height at most $3|S| - 1$.

Some proofs of the factorisation forest theorem can be found in |30l [TJ |8]. Our proof 
could follow similar lines as the above ones. Instead, we try to reuse as many lemmas
as possible from the above constructions.



For proving Theorem 3.3, we will need one of Green's relations, namely the $\mathcal{J}$-relation
(while there are five relations in general). Let us fix a semigroup $S$. We denote by
$S^1$ the semigroup extended (if necessary) with a neutral element 1 (this transforms $S$ into
a monoid). Given two elements $a, b \in S$, $a \le_{\mathcal{J}} b$ if $a = x \cdot b \cdot y$ for some $x, y \in S^1$. If $a \le_{\mathcal{J}} b$
and $b \le_{\mathcal{J}} a$, then $a \mathrel{\mathcal{J}} b$. We write $a <_{\mathcal{J}} b$ to denote $a \le_{\mathcal{J}} b$ and $b \not\le_{\mathcal{J}} a$. The interested
reader can see, e.g., [8] for an introduction to the relations of Green (with a proof of the
factorisation forest theorem), or monographs such as [31], [15] or [39] for deep presentations
of this theory. Finally, let us call a regular element in a semigroup an element $a$ such
that $a \cdot x \cdot a = a$ for some $x \in S^1$.

Footnote to Theorem 3.9: the exact bound of $3|S| - 1$ is due to Kufleitner [30]. It is likely that the same bound could be achieved
for Theorem 3.3. We prefer here a simpler proof with a bound of $3|S|$.

The next lemma gathers some classical results concerning finite semigroups.

Lemma 3.10. Given a $\mathcal{J}$-class $J$ in a finite semigroup, the following facts are equivalent:
• $J$ contains an idempotent,
• $J$ contains a regular element,
• there exist $a, b \in J$ such that $a \cdot b \in J$,
• all elements in $J$ are regular,
• all elements in $J$ can be written as $e \cdot c$ for some idempotent $e \in J$,
• all elements in $J$ can be written as $c \cdot e$ for some idempotent $e \in J$.
Such $\mathcal{J}$-classes are called regular.
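The first three items of Lemma 3.10 can be checked by brute force on a small example. The Python sketch below computes the $\mathcal{J}$-classes of a finite semigroup from its multiplication and tests each class. The three-element monoid {1, x, 0} with x·x = 0 and all function names are assumptions chosen here for illustration; they do not come from the paper.

```python
def j_classes(elements, mul):
    """J-classes of a finite semigroup, as a sorted list of frozensets."""
    def ideal(b):
        left = {b} | {mul(x, b) for x in elements}             # S^1 . b
        return frozenset(left | {mul(y, z) for y in left for z in elements})
    ideals = {b: ideal(b) for b in elements}                   # S^1 . b . S^1
    classes = {frozenset(c for c in elements if ideals[c] == ideals[b])
               for b in elements}
    return sorted(classes, key=sorted)


def has_idempotent(cls, mul):
    return any(mul(e, e) == e for e in cls)


def is_regular(cls, elements, mul):
    """True if the class contains some a with a.x.a = a (an idempotent a
    satisfies this with x = a, so ranging over S is enough)."""
    return any(mul(mul(a, x), a) == a for a in cls for x in elements)


def has_internal_product(cls, mul):
    return any(mul(a, b) in cls for a in cls for b in cls)


# Hypothetical example: the monoid {1, x, 0} with x.x = 0.
ELEMENTS = ["1", "x", "0"]


def mul(p, q):
    if p == "1":
        return q
    if q == "1":
        return p
    return "0"
```

On this example the classes are {0}, {1} and {x}; the first two are regular, while {x} is irregular since x·x = 0 falls outside the class.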

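We will use the following technical lemma.

Lemma 3.11. If $f = e \cdot x \cdot e$ for $e \mathrel{\mathcal{J}} f$ two idempotents, then $e = f$.

Proof. We use some standard results concerning finite semigroups. The interested reader
can find the necessary material for instance in [39]. Let us just recall that the relations $\le_{\mathcal{L}}$,
$\le_{\mathcal{R}}$ and $\mathcal{L}$ and $\mathcal{R}$ are the one-sided variants of $\le_{\mathcal{J}}$ and $\mathcal{J}$ ($\mathcal{L}$ stands for "left" and $\mathcal{R}$ for
"right"). Namely, $a \le_{\mathcal{L}} b$ (resp. $a \le_{\mathcal{R}} b$) holds if $a = x \cdot b$ for some $x \in S^1$ (resp. $a = b \cdot x$),
and $\mathcal{L} = {\le_{\mathcal{L}}} \cap {\ge_{\mathcal{L}}}$ (resp. $\mathcal{R} = {\le_{\mathcal{R}}} \cap {\ge_{\mathcal{R}}}$). Finally, $\mathcal{H} = \mathcal{L} \cap \mathcal{R}$.

The proof is very short. By definition $f \le_{\mathcal{L}} e$ since $e \cdot x \cdot e = f$. Since by assumption
$f \mathrel{\mathcal{J}} e$, we obtain $f \mathrel{\mathcal{L}} e$ (a classical result in finite semigroups). In a symmetric way $f \mathrel{\mathcal{R}} e$.
Thus $f \mathrel{\mathcal{H}} e$. Since an $\mathcal{H}$-class contains at most one idempotent, $f = e$ (it is classical that
any $\mathcal{H}$-class, when containing an idempotent, has a group structure; since groups contain
exactly one idempotent element, this is the only one). □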

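The next lemma shows that the stabilisation operation behaves in a very uniform way
inside $\mathcal{J}$-classes (similar arguments can be found in the works of Leung, Simon and Kirsten).

Lemma 3.12. If $e \mathrel{\mathcal{J}} f$ are idempotents, then $e^\sharp \mathrel{\mathcal{J}} f^\sharp$. Furthermore, if $e = x \cdot f \cdot y$ for some
$x, y$, then $e^\sharp = x \cdot f^\sharp \cdot y$.

Proof. For the second part, assume $e = x \cdot f \cdot y$ and $e \mathrel{\mathcal{J}} f$. Let $f' = (f \cdot y \cdot e \cdot x \cdot f)$. We
easily check $f' \cdot f' = f'$. Furthermore $f \mathrel{\mathcal{J}} e = (x \cdot f \cdot y) \cdot e \cdot (x \cdot f \cdot y) \le_{\mathcal{J}} f' \le_{\mathcal{J}} f$. Hence
$f \mathrel{\mathcal{J}} f'$. It follows by Lemma 3.11 that $f' = f$. We now compute $e^\sharp = (x \cdot f \cdot f \cdot y)^\sharp =
x \cdot f \cdot (f \cdot y \cdot x \cdot f)^\sharp \cdot f \cdot y = x \cdot f \cdot f^\sharp \cdot f \cdot y = x \cdot f^\sharp \cdot y$ (using consistency and $f = f'$).

This proves that $e \mathrel{\mathcal{J}} f$ implies $e^\sharp \le_{\mathcal{J}} f^\sharp$. Using symmetry, we obtain $e^\sharp \mathrel{\mathcal{J}} f^\sharp$. □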

Hence, if $J$ is a regular $\mathcal{J}$-class, there exists a unique $\mathcal{J}$-class $J^\sharp$ which contains $e^\sharp$ for
one/all idempotents $e \in J$. If $J = J^\sharp$, then $J$ is called stable, otherwise, it is called unstable.
The following lemma shows that stabilisation is trivial over stable $\mathcal{J}$-classes.

Lemma 3.13. If $J$ is a stable $\mathcal{J}$-class, then $e^\sharp = e$ for all idempotents $e \in J$.

Proof. Indeed, we have $e^\sharp = e \cdot e^\sharp \cdot e$ and thus by Lemma 3.11, $e^\sharp = e$. □



The situation is different for unstable $\mathcal{J}$-classes. In this case, the stabilisation always
goes down in the $\mathcal{J}$-order.
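As a small worked example of the stable/unstable distinction, consider the three-element stabilisation monoid below. This monoid is an assumption chosen here for illustration (it is not taken from this section): $b$ is the unit, $0$ is absorbing, $a \cdot a = a$, the order is $0 \le a \le b$, and the stabilisation is given in the first line.

```latex
% Hypothetical example: M = {b, a, 0}, with b the unit and 0 absorbing.
\begin{align*}
  & b^\sharp = b, \qquad a^\sharp = 0, \qquad 0^\sharp = 0,
    \qquad 0 <_{\mathcal{J}} a <_{\mathcal{J}} b, \\
  & \{b\}^\sharp = \{b\}: && \text{stable, and indeed } b^\sharp = b, \\
  & \{a\}^\sharp = \{0\} \neq \{a\}: && \text{unstable, and } a^\sharp = 0 <_{\mathcal{J}} a, \\
  & \{0\}^\sharp = \{0\}: && \text{stable, and indeed } 0^\sharp = 0.
\end{align*}
```

Each element forms its own $\mathcal{J}$-class here, and the only unstable class is $\{a\}$, whose stabilisation falls strictly below it in the $\mathcal{J}$-order.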

Lemma 3.14. If $J$ is an unstable $\mathcal{J}$-class, then $e^\sharp <_{\mathcal{J}} e$ for all idempotents $e \in J$.

Proof. Since $e^\sharp = e \cdot e^\sharp$, it is always the case that $e^\sharp \le_{\mathcal{J}} e$. Assuming $J$ is unstable means
that $e \mathrel{\mathcal{J}} e^\sharp$ does not hold, which in turn implies $e^\sharp <_{\mathcal{J}} e$. □

We say that a word $u = a_1 \dots a_n$ in $S^+$ is $J$-smooth, for $J$ a $\mathcal{J}$-class, if $u \in J^+$, and
$\pi(u) \in J$. It is equivalent to say that $\pi(a_i a_{i+1} \cdots a_j) \in J$ for all $1 \le i \le j \le n$. Indeed for
all $1 \le i \le j \le n$, $J \ni \pi(a_1 \dots a_n) \le_{\mathcal{J}} \pi(a_i a_{i+1} \cdots a_j) \le_{\mathcal{J}} a_i \in J$. Remark that, according
to Lemma 3.10, if $J$ is irregular, $J$-smooth words have length at most 1. We will use the
following lemma from [8] as a black box. This is an instance of the factorisation forest
theorem, but restricted to a single $\mathcal{J}$-class.
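The equivalence between the two formulations of $J$-smoothness can be tested exhaustively on a small semigroup. The sketch below uses the five-element semigroup of 2×2 matrix units together with a zero, whose non-zero elements form a single regular $\mathcal{J}$-class. This choice of semigroup and all names in the code are assumptions made here for illustration.

```python
from itertools import product

ZERO = "0"
UNITS = [(i, j) for i in (1, 2) for j in (1, 2)]  # the J-class of matrix units


def mul(p, q):
    """(i, j).(k, l) = (i, l) if j == k, and 0 otherwise; 0 is absorbing."""
    if ZERO in (p, q) or p[1] != q[0]:
        return ZERO
    return (p[0], q[1])


def pi(word):
    """Product of the letters of a non-empty word."""
    out = word[0]
    for letter in word[1:]:
        out = mul(out, letter)
    return out


def is_smooth(word, J):
    """First formulation: all letters are in J and so is pi(word)."""
    return all(a in J for a in word) and pi(word) in J


def all_infixes_in(word, J):
    """Second formulation: the product of every non-empty infix is in J."""
    return all(pi(word[i:j + 1]) in J
               for i in range(len(word)) for j in range(i, len(word)))
```

For example, the word (1,2)(2,1)(1,1) is smooth for this class, whereas (1,2)(1,2) is not since its product is the zero.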

Lemma 3.15 (Lemma 14 in [8]). Given a finite semigroup $S$, one of its $\mathcal{J}$-classes $J$, and a
$J$-smooth word $u$, there exists a Ramsey factorisation for $u$ of height at most $3|J| - 1$.

Remark that Ramsey factorisations and $n$-computations differ only in what is allowed
for a node of large degree, i.e., above $n$. That is why our construction makes use of
Lemma 3.15 to produce Ramsey factorisations, and then, based on the presence of nodes of
large degree, constructs a computation by gluing pieces of Ramsey factorisations together.

Lemma 3.16. Let $J$ be a $\mathcal{J}$-class, $u$ be a $J$-smooth word, and $n$ be some non-negative
integer. Then one of the two following items holds:

(1) there exists an $n$-computation for $u$ of value $\pi(u)$ and height at most $3|J| - 1$, or;

(2) there exists an $n$-computation for some non-empty prefix of $u$ of value $a <_{\mathcal{J}} J$
and height at most $3|J|$.



Proof. Remark that if $J$ is irregular, then $u$ has length 1 by Lemma 3.10, and the result
is straightforward. Remark also that if $J$ is stable, since the stabilisation is trivial in
stable $\mathcal{J}$-classes (Lemma 3.13), every Ramsey factorisation for $u$ of height at most $3|J| - 1$
(which exists by Lemma 3.15) is in fact an $n$-computation for $u$.

The case of $J$ unstable remains. Let us say that a node in a factorisation is big if its
degree is more than $n$. Our goal is to "correct" the value of big nodes. If there is a Ramsey
factorisation for $u$ which has no big node, then it can be seen as an $n$-computation, and
once more the first conclusion of the lemma holds.

Otherwise, consider the least non-empty prefix $u'$ of $u$ for which there is a Ramsey
factorisation of height at most $3|J| - 1$ which contains a big node. Let $F$ be such a factorisation
and $x$ be a big node in $F$ which is maximal for the descendant relation (there are no other
big nodes below). Let $F'$ be the subtree of $F$ rooted in $x$. This decomposes $u'$ into $v v' v''$
where $v'$ is the factor of $u'$ for which $F'$ is a Ramsey factorisation. For this $v'$, it is easy to
transform $F'$ into an $n$-computation $T'$ for $v'$: just replace the label $e$ of the root of $F'$ by
$e^\sharp$. Indeed, since there are no other big nodes in $F'$ than the root, the root is the only place
which prevents $F'$ from being an $n$-computation. Remark that from Lemma 3.14, the value
of $T'$ is $<_{\mathcal{J}} J$.

If $v$ is empty, then $v'$ is a prefix of $u$, and $T'$ is an $n$-computation for it. The second
conclusion of the lemma holds.

Otherwise, by the minimality assumption and Lemma 3.15, there exists a Ramsey
factorisation $T$ for $v$ of height at most $3|J| - 1$ which contains no big node. Both $T$ and
$T'$ being $n$-computations of height at most $3|J| - 1$, it is easy to combine them into an
$n$-computation of height at most $3|J|$ for $vv'$, which inherits
from $T'$ the property that its value is $<_{\mathcal{J}} J$. It proves that the second conclusion of the
lemma holds. □



Footnote to Lemma 3.16: a closer inspection would reveal that $a \in J^\sharp$. This extra information is useless for our purpose.

We are now ready to establish Theorem 3.3.

Proof. The proof is by induction on the size of a left-right-ideal $Z \subseteq S$, i.e., $S^1 \cdot Z \cdot S^1 \subseteq Z$
(remark that a left-right-ideal is a union of $\mathcal{J}$-classes). We establish by induction on the
size of $Z$ the following induction hypothesis:

IH: for all words $u \in Z^+ + Z^* S$ there exists an $n$-computation of height at most $3|Z|$ for $u$.

Of course, for $Z = S$, this proves Theorem 3.3.



The base case is when $Z$ is empty: then $u$ has length 1, and a single node tree establishes
the induction hypothesis (recall that the convention is that the leaves
do not count in the height, and as a consequence a single node tree has height 0).

Otherwise, assume $Z$ non-empty. There exists a maximal $\mathcal{J}$-class $J$ (maximal for $\le_{\mathcal{J}}$)
included in $Z$. From the maximality assumption, we can check that $Z' = Z \setminus J$ is again a
left-right-ideal. Remark also that since $Z$ is a left-right-ideal, it is downward closed for $\le_{\mathcal{J}}$.
This means in particular that every element $a$ such that $a <_{\mathcal{J}} J$ belongs to $Z'$.

Claim: We claim (★) that for all words $u \in Z^+ + Z^* S$,

(1) either there exists an $n$-computation of height at most $3|J|$ for $u$, or;

(2) there exists an $n$-computation of height at most $3|J|$ for some non-empty prefix of
$u$ of value in $Z'$.

Let $w$ be the longest $J$-smooth prefix of $u$. If there exists no such non-empty prefix,
this means that the first letter $a$ of $u$ does not belong to $J$. Two subcases can happen.
If $u$ has length 1, this means that $u = a$, and thus $a$ is an $n$-computation witnessing the
first conclusion of (★). Otherwise $u$ has length at least 2, and thus $a$ belongs to $Z$. Since
furthermore it does not belong to $J$, it belongs to $Z'$. In this case, $a$ is an $n$-computation
witnessing the second conclusion of (★).

Otherwise, according to Lemma 3.16 applied to $w$, two situations can occur. The first
case is when there is an $n$-computation $T$ for $w$ of value $\pi(w)$ and height at most $3|J| - 1$.
There are several sub-cases. If $u = w$, of course, the $n$-computation $T$ is a witness that the
first conclusion of (★) holds. Otherwise, there is a letter $a$ such that $wa$ is a prefix of $u$. If
$wa = u$, then $\pi(wa)[T, a]$ is an $n$-computation for $wa$ of height at most $3|J|$, witnessing that
the first conclusion of (★) holds. Otherwise, $a$ has to belong to $Z$ (because all letters of $u$
have to belong to $Z$ except possibly the last one). But, by maximality of $w$ as a $J$-smooth
prefix, either $a \in Z'$, or $\pi(wa) \in Z'$. Since $Z'$ is a left-right-ideal, $a \in Z'$ implies $\pi(wa) \in Z'$.
Then, $\pi(wa)[T, a]$ is an $n$-computation for $wa$ of height at most $3|J|$ and value $\pi(wa) \in Z'$.
This time, the second conclusion of (★) holds.

The second case according to Lemma 3.16 is when there exists a prefix $v$ of $w$ for which
there is an $n$-computation of height at most $3|J|$ of value $<_{\mathcal{J}} J$. In this case, $v$ is also a
prefix of $u$, and the value of this computation is in $Z'$. Once more the second conclusion of
(★) holds. This concludes the proof of Claim (★).

As long as the second conclusion of the claim (★) applied on the word $u$ holds, this
decomposes $u$ into $v_1 u'$, and we can proceed with $u'$. In the end, we obtain that all words
$u \in Z^+ + Z^* S$ can be decomposed into $u_1 \dots u_k$ such that there exist $n$-computations
$T_1, \dots, T_k$ of height at most $3|J|$ for $u_1, \dots, u_k$ respectively, and such that the values of
$T_1, \dots, T_{k-1}$ all belong to $Z'$ (but not necessarily the value of $T_k$). Let $a_1, \dots, a_k$ be the
values of $T_1, \dots, T_k$ respectively. The word $a_1 \dots a_k$ belongs to $Z'^+ + Z'^* S$. Let us apply
the induction hypothesis to the word $a_1 \dots a_k$. We obtain an $n$-computation $T$ for $a_1 \dots a_k$
of height at most $3|Z'|$. By simply substituting $T_1, \dots, T_k$ to the leaves of $T$, we obtain an
$n$-computation for $u$ of height at most $3|J| + 3|Z'| = 3|Z|$. (Remark once more here that the
convention is to not count the leaves in the height. Hence the height after a substitution is
bounded by the sum of the heights.) □
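The height bookkeeping used in the last step (heights add up under substitution because leaves do not count) can be illustrated as follows. This is a minimal sketch in which trees are encoded as nested Python lists and any non-list value is a leaf; this encoding and the function names are assumptions made here.

```python
def height(tree):
    """Height with the convention that leaves do not count:
    a single leaf has height 0."""
    if not isinstance(tree, list):
        return 0
    return 1 + max(height(child) for child in tree)


def substitute(tree, subtrees):
    """Replace the leaves of `tree`, from left to right, by the trees
    listed in `subtrees` (one per leaf)."""
    pending = iter(subtrees)

    def go(t):
        if not isinstance(t, list):
            return next(pending)
        return [go(child) for child in t]

    return go(tree)


outer = [["x", "x"], "x"]                              # height 2, three leaves
parts = [["a", "a", "a"], "b", [["c", "c"], "c"]]      # heights 1, 0, 2
combined = substitute(outer, parts)
```

Here `height(combined)` is at most `height(outer)` plus the largest height among `parts`, which mirrors the bound $3|Z'| + 3|J| = 3|Z|$ in the proof.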



3.5. Comparing computations: the proof of Theorem 3.4. We now establish the
second key theorem for computations, namely Theorem 3.4, which states that the result of
computations is, in some sense, unique. The proof works by a case analysis on the possible
ways the over-computations and under-computations may overlap. We perform this proof
for stabilisation monoids, thus using sm-computations. More precisely, all statements take
as input computations, and output sm-computations, which can then be normalised into
non-sm computations. The result for stabilisation semigroups can be derived from it. We
fix from now on a stabilisation monoid $M$.

Lemma 3.17. For all $n$-over-computations of value $a$ over a word $u \in M^*$ of length at
most $n$, $\pi(u) \le a$.

Proof. By induction on the height of the over-computation, using the fact that an
$n$-over-computation for a word of length at most $n$ cannot contain a stabilisation node. □

Lemma 3.18. For all $n$-over-computations of value $b$ over a word $b_1 \dots b_k$ ($k \ge 1$) such
that $e \le b_i$ for all $i$, and $e$ is an idempotent, then $e^\sharp \le b$.

Proof. By induction on the height of the over-computation. □

A sequence of words $u_1, \dots, u_k$ is called a decomposition of $u$ if $u = u_1 \dots u_k$. We say
that a non-leaf [under/over]-computation $T$ for a word $u$ decomposes $u$ into $u_1, \dots, u_k$ if
the subtree rooted at the $i$th child of the root is an [under/over]-computation for $u_i$, for
all $i = 1 \dots k$. Our proof will mainly make use of over-computations. For this reason, we
introduce the following terminology.

We say that a word $u \in M^*$ $n$-evaluates to $a \in M$ if there exists an $n$-over-computation
for $u$ of value $a$. We will also say that $u_1, \dots, u_k$ $n$-evaluate to $b_1, \dots, b_k$ if $u_i$ $n$-evaluates
to $b_i$ for all $i = 1 \dots k$.

This notion is subject to elementary reasoning such as (a) $u$ $n$-evaluates to $\pi(u)$ or (b)
if $u_1, \dots, u_k$ $n$-evaluate to $b_1, \dots, b_k$ and $b_1 \dots b_k$ $n$-evaluates to $b$, then $u_1 \dots u_k$ $n$-evaluates
to $b$.

The core of the proof is contained in the following property:

Lemma 3.19. There exists a polynomial $\alpha$ such that for all $u_1, \dots, u_k \in M^*$, if $u_1 \dots u_k$
$\alpha(n)$-evaluates to $b$ then $u_1, \dots, u_k$ $n$-evaluate to $b_1, \dots, b_k$, and $b_1 \dots b_k$ $n$-evaluates to $b$, for
some $b_1, \dots, b_k \in M$.



From this result, we can deduce Theorem 3.4 as follows.

Proof of Theorem 3.4. Let $\alpha$ be as in Lemma 3.19. Let $\alpha_p$ be the $p$th composition of $\alpha$ with
itself. Let $U$ be an $n$-under-computation of height at most $p$ for some word $u$ of value $a$,
and $T$ be an $\alpha_p(n)$-over-computation for $u$ of value $b$. We want to establish that $a \le b$. The
proof is by induction on $p$.

If $p = 0$, this means that $u$ has length 1, then $T$ and $U$ are also restricted to a single
leaf, and the result obviously holds. Otherwise, $U$ decomposes $u$ into $u_1, \dots, u_k$. Let
$a_1, \dots, a_k$ be the values of the children of the root of $U$, read from left to right. By applying
Lemma 3.19 on $T$ and the decomposition $u_1, \dots, u_k$, we construct the $\alpha_{p-1}(n)$-over-computations
$B_1, \dots, B_k$ for $u_1, \dots, u_k$ respectively, and of respective values $b_1, \dots, b_k$, as
well as an $\alpha_{p-1}(n)$-over-computation $B$ of value $b$ for $b_1 \dots b_k$.

For all $i = 1 \dots k$, we can apply the induction hypothesis on $U^i$ (let us recall that $U^i$ is
the sub-under-computation rooted at the $i$th child of the root of $U$) and $B_i$, and obtain that
$a_i \le b_i$. Depending on $k$, three cases have to be separated. If $k = 2$ (binary node), then
$a \le a_1 \cdot a_2 \le b_1 \cdot b_2 \le b$. If $3 \le k \le n$ (idempotent node), we have $a \le a_1 = \dots = a_k = e$
which is an idempotent. We have $e = a_i \le b_i$ for all $i = 1 \dots k$. Hence by Lemma 3.17, $e \le b$,
which means $a \le b$. If $k > n$ (stabilisation node), we have once more $a_1 = \dots = a_k = e$
which is an idempotent, and such that $a \le e^\sharp$. This time, by Lemma 3.18, we have $e^\sharp \le b$.
We obtain once more $a \le b$. □

The remainder of this section is dedicated to the proof of Lemma 3.19.



Lemma 3.20. There exists a positive integer $K$ such that for all idempotents $e, f$, whenever
$f \le a_i \cdot b_i$ for all $i = 1 \dots k$, $k \ge K$, and $b_i \cdot a_{i+1} \le e$ for $i = 1 \dots k - 1$, then $f^\sharp \le a_1 \cdot e^\sharp \cdot b_k$.

Proof. To each ordered pair $i < j$, let us associate the color $c_{i,j} = (a_i, \pi(b_i a_{i+1} \dots b_{j-1}))$.
We now apply the theorem of Ramsey to this coloring, for $K$ sufficiently large, and get that
there exist $1 \le i < s < j \le k$ such that $c_{i,s} = c_{s,j} = c_{i,j} = (a, b)$. This implies in particular
that $b \cdot a \cdot b = b$, and thus $a \cdot b$ and $b \cdot a$ are idempotents. Furthermore, $f \le a \cdot b$ and $b \cdot a \le e$.
It follows from consistency that $f^\sharp \le (a \cdot b)^\sharp \le a \cdot (b \cdot a)^\sharp \cdot b \le a \cdot e^\sharp \cdot b = a_i \cdot e^\sharp \cdot \pi(b_s \dots b_{j-1})$.
We now have, using the assumptions that $f \le a_h \cdot b_h$ and $b_h \cdot a_{h+1} \le e$,

$f^\sharp = f \cdot f^\sharp \cdot f \le \pi(a_1 \dots a_i) \cdot e^\sharp \cdot \pi(b_s \dots b_k) \le a_1 \cdot e \cdot e^\sharp \cdot e \cdot b_k = a_1 \cdot e^\sharp \cdot b_k$ .

□
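The Ramsey step of this proof, namely colouring the pairs $i < j$ and looking for indices $i < s < j$ with three equal colours, can be made concrete by a brute-force search. The sketch below only illustrates that step (it does not compute the constant $K$, nor check the order conditions on $e$ and $f$). The semigroup of 2×2 matrix units with a zero, the 0-based indexing and all names are assumptions made here.

```python
from itertools import combinations

ZERO = "0"


def mul(p, q):
    """Matrix units: (i, j).(k, l) = (i, l) if j == k, and 0 otherwise."""
    if ZERO in (p, q) or p[1] != q[0]:
        return ZERO
    return (p[0], q[1])


def pi(word):
    out = word[0]
    for letter in word[1:]:
        out = mul(out, letter)
    return out


def colour(a, b, i, j):
    """c_{i,j} = (a_i, pi(b_i a_{i+1} b_{i+1} ... a_{j-1} b_{j-1})) for i < j."""
    letters = [b[i]]
    for h in range(i + 1, j):
        letters += [a[h], b[h]]
    return (a[i], pi(letters))


def find_triple(a, b):
    """First i < s < j (0-based) with c_{i,s} = c_{s,j} = c_{i,j}, or None."""
    for i, s, j in combinations(range(len(a)), 3):
        if colour(a, b, i, s) == colour(a, b, s, j) == colour(a, b, i, j):
            return (i, s, j)
    return None
```

With a = [(1,1), (1,2), (1,2), (1,2)] and b = [(1,1), (2,1), (2,1), (2,1)], the search skips the triples starting at index 0 and finds one among the last three positions; the common colour (x, y) then satisfies y·x·y = y, as used in the proof.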

The following lemma will be used for treating the case of idempotent and stabilisation
nodes in the proof of Lemma 3.19.

Lemma 3.21. There exists a polynomial $\beta$ such that, if $x_1, y_1, x_2, y_2, \dots, x_m, y_m$ ($m \ge 1$)
are elements of $M$ and $e$ is an idempotent such that $x_h \cdot y_h \le e$ for all $h = 1 \dots m$, then
$(y_1 \cdot x_2)(y_2 \cdot x_3) \cdots (y_{m-1} \cdot x_m)$ $n$-evaluates to $z$ such that:

• $x_1 \cdot z \cdot y_m \le e$, and;

• if $m > \beta(n)$ or $(x_h \cdot y_h) \le e^\sharp$ for some $h = 1 \dots m$ then $x_1 \cdot z \cdot y_m \le e^\sharp$.

Proof. Let us treat first the case $m \le \beta(n)$, whatever $\beta$ is. Remark first that
$(y_1 \cdot x_2) \cdots (y_{m-1} \cdot x_m)$ naturally $n$-evaluates to $z = \pi(y_1 x_2 \cdots y_{m-1} x_m)$. Thus $x_1 \cdot z \cdot y_m =
\pi(x_1 y_1 \cdots x_m y_m) = (x_1 \cdot y_1) \cdots (x_m \cdot y_m)$, which is $\le e$, and $\le e^\sharp$ as soon as $x_h \cdot y_h \le e^\sharp$ for some $h$.

Assume now that $m > \beta(n)$ for $\beta(n) = (n + K)^{3|M|} + 1$ where $K$ is the constant obtained
from Lemma 3.20. Set $d_i = (y_i \cdot x_{i+1})$ for all $i = 1 \dots m - 1$.

We first claim (★) that there exist $i < j$ such that $v = d_i \dots d_{j-1}$ $n$-evaluates to
$y_i \cdot e^\sharp \cdot x_j$. For this, consider the word $u = d_1 \dots d_{m-1}$, and apply Theorem 3.3 for producing
an $(n + K)$-computation $U$ for $u$ of height at most $3|M|$. The word $u$ has length $m - 1 >
\beta(n) - 1 = (n + K)^{3|M|}$. Thus there is a stabilisation node in $U$, say of degree $k > n + K$.
Let $S$ be a subtree of $U$ rooted at some stabilisation node. Let $f$ be the (idempotent)
value of the children of this node, the value of $S$ being $f^\sharp$. This subtree corresponds to
the factor $v = d_i \dots d_{j-1}$ of $u$. We have to show that $v$ $n$-evaluates to $y_i \cdot e^\sharp \cdot x_j$. The
computation $S$ decomposes $v$ into $v_1, \dots, v_k$ and each $v_h$ is of the form $d_{i_h} \dots d_{i_{h+1}-1}$ for
$h = 1 \dots k$ with $i = i_1 < \dots < i_{k+1} = j$. Define now $a_h$ to be $y_{i_h}$ and $b_h$ to be
$\pi(x_{i_h+1} y_{i_h+1} \dots x_{i_{h+1}})$ for all $h = 1 \dots k$. It is clear that $f \le \pi(v_h) = a_h \cdot b_h$ for
all $h = 1 \dots k$ since there is a computation over $v_h$ of value $f$. Furthermore, $b_h \cdot a_{h+1} \le e$
for all $h = 1 \dots k - 1$. Hence we can apply Lemma 3.20, and get that $f^\sharp \le a_1 \cdot e^\sharp \cdot b_k$.
Since furthermore $a_1 = y_i$, and $b_k$ is either $\le x_j$ or $\le e \cdot x_j$, it follows that $v$ $n$-evaluates to
$f^\sharp \le y_i \cdot e^\sharp \cdot x_j$, and hence to $y_i \cdot e^\sharp \cdot x_j$. This concludes the proof of the claim (★).

Set now

$z = \pi(y_1 \dots y_i) \cdot e^\sharp \cdot \pi(x_j \dots x_m)$ .

Since, using the claim (★), $d_1 \dots d_{i-1}$, $d_i \dots d_{j-1}$, $d_j \dots d_{m-1}$ $n$-evaluate to $\pi(d_1 \dots d_{i-1})$,
$y_i \cdot e^\sharp \cdot x_j$, $\pi(d_j \dots d_{m-1})$, we get that $d_1 \dots d_{m-1}$ $n$-evaluates to $z$. Furthermore, $x_1 \cdot z \cdot y_m =
\pi(x_1 \dots y_i) \cdot e^\sharp \cdot \pi(x_j \dots y_m) \le e^\sharp$. This proves the second conclusion of the statement. □

We are now ready to conclude. 

Proof of Lemma 3.19. Let us set $\alpha(n)$ to be $(n + 1)\beta(n) - 1$, where $\beta$ is the polynomial
taken from Lemma 3.21. Lemma 3.19 follows from the following induction hypothesis:

Induction hypothesis: For all words $u$ which $\alpha(n)$-evaluate to $b$, and all decompositions of
$u$ into $u_1, \dots, u_k$ ($k \ge 2$), then $u_1, \dots, u_k$ $n$-evaluate to $b_1, \dots, b_k$, and $b_2 \dots b_{k-1}$ $n$-evaluates
to $c$ such that $b_1 \cdot c \cdot b_k \le b$.

Induction parameter: The height of the $\alpha(n)$-over-computation $T$ witnessing that $u$
$\alpha(n)$-evaluates to $b$.

It should be clear that this implies Lemma 3.19 since this means that $b_1 \dots b_k$ $n$-evaluates
to $b_1 \cdot c \cdot b_k \le b$.
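To get a feeling for the growth of the thresholds involved, the polynomials $\beta(n) = (n + K)^{3|M|} + 1$ and $\alpha(n) = (n + 1)\beta(n) - 1$, as well as the iterates $\alpha_p$ used in the proof of Theorem 3.4, can be evaluated numerically. In the sketch below, the function names are assumptions, and the values passed for $K$ (the Ramsey constant of Lemma 3.20, which the text does not compute) are placeholders only.

```python
def beta(n, K, size_M):
    """beta(n) = (n + K)^(3|M|) + 1, as in the proof of Lemma 3.21."""
    return (n + K) ** (3 * size_M) + 1


def alpha(n, K, size_M):
    """alpha(n) = (n + 1) * beta(n) - 1, as in the proof of Lemma 3.19."""
    return (n + 1) * beta(n, K, size_M) - 1


def alpha_iter(p, n, K, size_M):
    """alpha_p(n): alpha composed p times with itself (alpha_0 is the identity)."""
    for _ in range(p):
        n = alpha(n, K, size_M)
    return n
```

With the placeholder values K = 2 and |M| = 1, one gets beta(1) = 28 and alpha(1) = 55; iterating alpha quickly produces very large thresholds, which is consistent with the bound of Theorem 3.4 depending on the height $p$.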

The essential idea in the proof of the induction hypothesis is that $T$ decomposes the
word into $v_1, \dots, v_\ell$ and we have to study all the possible ways the $v_i$'s and the
$u_j$'s may overlap. In practice, we will not refer much to $T$, but simply to how it decomposes
the word into $v_1, \dots, v_\ell$. Thus, from now on, let $v_1, \dots, v_\ell$ and
$u_1, \dots, u_k$ be decompositions of a word $u$ such that each of the $v_i$'s
$\alpha(n)$-evaluates to $a_i$ and is subject to the application of the induction hypothesis.

Leaves. This means that $\ell = 1$. All the $u_h$'s should be empty, but one, say $u_h = a$ where
$a$ is the letter labelling the leaf. Three cases can occur depending on $h$. If $h = 1$, then
$u_1, \dots, u_k$ obviously $n$-evaluate to $a, 1, \dots, 1, 1$, and $1 \dots 1$ $n$-evaluates
to $1$, and we indeed have $a \cdot 1 \cdot 1 \leq a$. The case $h = k$ is symmetric. Finally, if
$1 < h < k$, then $u_1, \dots, u_k$ $n$-evaluate to $1, \dots, 1, a, 1, \dots, 1$, and
$1 \dots 1 a 1 \dots 1$ $n$-evaluates to $a$, and we indeed have $1 \cdot a \cdot 1 \leq a$.

Binary nodes. If $\ell = 2$, then there exist $s$ in $1, \dots, k$ and words $w, w'$ such that

$$v_1 = u_1 \dots u_{s-1} w , \quad u_s = w w' , \quad \text{and} \quad v_2 = w' u_{s+1} \dots u_k .$$

We can apply the induction hypothesis to both $v_1$ and $v_2$. We obtain that
$u_1, \dots, u_{s-1}, w$, $w', u_{s+1}, \dots, u_k$ $n$-evaluate to $b_1, \dots, b_{s-1}, d$,
$d', b_{s+1}, \dots, b_k$, and that $b_2 \dots b_{s-1}$, $b_{s+1} \dots b_{k-1}$ $n$-evaluate to
$c_1, c_2$ such that $b_1 \cdot c_1 \cdot d \leq a_1$ and $d' \cdot c_2 \cdot b_k \leq a_2$. It
follows that $u_s$ $n$-evaluates to $b_s = d \cdot d'$. Furthermore, $b_2 \dots b_{k-1}$
$n$-evaluates to $c_1 \cdot b_s \cdot c_2 = c$. Overall, $u_1, \dots, u_k$ $n$-evaluate to
$b_1, \dots, b_k$ and $b_2 \dots b_{k-1}$ $n$-evaluates to $c$ such that
$b_1 \cdot c \cdot b_k = (b_1 \cdot c_1 \cdot d) \cdot (d' \cdot c_2 \cdot b_k) \leq a_1 \cdot a_2$.

Idempotent and stabilisation nodes. Assume now that $v_1, \dots, v_\ell$ $\alpha(n)$-evaluate to
$e, \dots, e$, where $e$ is idempotent. We aim at proving that $u_1, \dots, u_k$ $n$-evaluate to
$b_1, \dots, b_k$, and $b_2 \dots b_{k-1}$ to $c$ such that $b_1 \cdot c \cdot b_k \leq e$, and
if $\ell > \alpha(n)$, $b_1 \cdot c \cdot b_k \leq e^\sharp$.

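We rely on a suitable decomposition of the words: there exist
$0 = i_0 < i_1 < \dots < i_m < i_{m+1} = \ell + 1$ and $1 = j_0 < \dots < j_m = k$ as well as
words $\varepsilon = u'_0, u''_0, u'_1, u''_1, \dots, u'_m, u''_m = \varepsilon$ such that

$$v_{i_h} = u''_{h-1}\, u_{j_{h-1}+1} \dots u_{j_h - 1}\, u'_h \quad \text{for all } h = 1 \dots m, \qquad (\star)$$

and

$$u_{j_h} = u'_h\, v_{i_h + 1} \dots v_{i_{h+1} - 1}\, u''_h \quad \text{for all } h = 0 \dots m.$$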

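The best is to present it through a drawing. It is annotated with all the variables that will
be used during the proof. The two main rows represent the two possible decompositions of
the word into $v_i$'s and $u_j$'s.

[Figure: the word drawn as two rows, one decomposed into the $v_i$'s and one into the $u_j$'s,
annotated with the variables used in the proof.]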

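Such a decomposition is not unique. It is sufficient to guarantee that each separation
between some $u_s$ and some $u_{s+1}$ falls in some $v_{i_h}$, and that each $v_{i_h}$ contains
such a separation. We can apply the induction hypothesis on each equation $(\star)$. Hence, it
follows that, for all $h = 1 \dots m$,

$$u''_{h-1},\ u_{j_{h-1}+1},\ \dots,\ u_{j_h - 1},\ u'_h \quad n\text{-evaluate to} \quad b''_{h-1},\ b_{j_{h-1}+1},\ \dots,\ b_{j_h - 1},\ b'_h$$

and $b_{j_{h-1}+1} \dots b_{j_h - 1}$ $n$-evaluates to $c_h$, such that
$b''_{h-1} \cdot c_h \cdot b'_h \leq e$. Set furthermore $b'_0 = b''_m = 1$. We get that
$u'_h, u''_h$ $n$-evaluate to $b'_h, b''_h$ for all $h = 0 \dots m$. Define furthermore for all
$h = 0 \dots m$, $e_h$ as

$$e_h = \begin{cases} 1 & \text{if } i_{h+1} - i_h - 1 = 0, \\ e & \text{if } 1 \leq i_{h+1} - i_h - 1 \leq n, \\ e^\sharp & \text{if } i_{h+1} - i_h - 1 > n. \end{cases}$$

Since each $v_h$ $\alpha(n)$-evaluates to $e$, each $v_h$ also $n$-evaluates to $e$. Now $e_h$ has
been chosen such that $v_{i_h+1} \dots v_{i_{h+1}-1}$ $n$-evaluates to $e_h$. Thus $u_{j_h}$
$n$-evaluates, for all $h = 0 \dots m$, to $b_{j_h}$, which we define as
$b'_h \cdot e_h \cdot b''_h$. At this point, we have that

C1 $u_1, \dots, u_k$ $n$-evaluate to $b_1, \dots, b_k$.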

To head toward the conclusion, we will use Lemma 3.21. Thus, let us set $x_h$ to be $b''_{h-1}$
and $y_h$ to be $c_h \cdot b'_h \cdot e_h$ for all $h = 1 \dots m$. We have

$$x_h \cdot y_h = (b''_{h-1} \cdot c_h \cdot b'_h) \cdot e_h \leq e \cdot e_h . \qquad (\dagger)$$

According to $(\dagger)$, $x_h \cdot y_h \leq e$, and we can apply Lemma 3.21 to

$$x_1, y_1, x_2, \dots, x_m, y_m$$

and obtain that $(y_1 \cdot x_2) \cdots (y_{m-1} \cdot x_m)$ $n$-evaluates to some $z$ subject to
the conclusions of the lemma (we will recall these conclusions upon need).

Let us now establish the following claims C2, C3 and C4.

C2 $b_2 \dots b_{k-1}$ $n$-evaluates to $(z \cdot c_m)$.

Indeed, for all $h = 1 \dots m$, $c_h$ is chosen such that $b_{j_{h-1}+1} \dots b_{j_h - 1}$
$n$-evaluates to $c_h$, thus $b_{j_{h-1}+1} \dots b_{j_h}$ $n$-evaluates to:

$$c_h \cdot b_{j_h} = c_h \cdot b'_h \cdot e_h \cdot b''_h = y_h \cdot x_{h+1} ,$$

by just unfolding the definitions. Since furthermore
$(y_1 \cdot x_2) \cdots (y_{m-1} \cdot x_m)$ $n$-evaluates to $z$, it follows that
$b_2 \dots b_{j_{m-1}}$ $n$-evaluates to $z$. Furthermore, by choice of $c_m$,
$b_{j_{m-1}+1} \dots b_{k-1}$ $n$-evaluates to $c_m$. Thus $b_2 \dots b_{k-1}$ $n$-evaluates to
$(z \cdot c_m)$ as claimed.

C3 $b_1 \cdot (z \cdot c_m) \cdot b_k \leq e$.

Indeed, according to the conclusions of Lemma 3.21, $x_1 \cdot z \cdot y_m \leq e$. Hence,

$$b_1 \cdot (z \cdot c_m) \cdot b_k = e_0 \cdot x_1 \cdot z \cdot y_m \leq e_0 \cdot e \leq e .$$
C4 if $\ell > \alpha(n) = (n+1)\beta(n) - 1$ then
$b_1 \cdot (z \cdot c_m) \cdot b_k \leq e^\sharp$ (with $\beta$ taken from Lemma 3.21).

Since $i_0 = 0$ and $i_{m+1} = \ell + 1$, we have

$$m + \sum_{h=0}^{m} (i_{h+1} - i_h - 1) = \ell .$$

Since $\ell > (n+1)\beta(n) - 1$ this means that either $m \geq \beta(n)$, or
$i_{h+1} - i_h - 1 > n$ for some $h = 0 \dots m$. This means that either $m \geq \beta(n)$, or
$e_h = e^\sharp$ for some $h = 0 \dots m$. In all cases,

$$b_1 \cdot (z \cdot c_m) \cdot b_k = e_0 \cdot x_1 \cdot z \cdot y_m \leq e^\sharp .$$

The last inequality can have three origins. Either $m \geq \beta(n)$, or $e_h = e^\sharp$ (recall
$(\dagger)$ stating that $x_h \cdot y_h \leq e \cdot e_h$) for some $h = 1 \dots m$, or
$e_0 = e^\sharp$. In the two first cases, by Lemma 3.21, $x_1 \cdot z \cdot y_m \leq e^\sharp$, and
thus $e_0 \cdot x_1 \cdot z \cdot y_m \leq e^\sharp$ (since $e_0$ is either $1$, or $e$, or
$e^\sharp$). In the third case, $e_0 \cdot x_1 \cdot z \cdot y_m \leq e^\sharp$ since
$x_1 \cdot z \cdot y_m \leq e$.
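The counting step behind C4 can be spelled out as follows. This is only a sketch: the name $g_h$ for the gap $i_{h+1} - i_h - 1$ is introduced here for illustration, and the threshold on $m$ is assumed to read $m \geq \beta(n)$. It shows that when $m < \beta(n)$ and every gap is at most $n$, the number $\ell$ of children cannot exceed $(n+1)\beta(n) - 1$.

```latex
\begin{align*}
\sum_{h=0}^{m} (i_{h+1} - i_h) &= i_{m+1} - i_0 = \ell + 1
  \quad\Longrightarrow\quad
  \ell = m + \sum_{h=0}^{m} g_h, \qquad g_h = i_{h+1} - i_h - 1, \\
\text{if } m \leq \beta(n) - 1 \text{ and } g_h \leq n \text{ for all } h:\quad
\ell &\leq m + (m+1)\,n = m\,(n+1) + n \\
     &\leq (\beta(n) - 1)(n+1) + n = (n+1)\,\beta(n) - 1 .
\end{align*}
```

By contraposition, $\ell > (n+1)\beta(n) - 1$ forces $m \geq \beta(n)$ or $g_h > n$ for some $h$.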

Gathering the claims C1, C2, C3, we get that $u_1, \dots, u_k$ $n$-evaluate to
$b_1, \dots, b_k$, that $b_2 \dots b_{k-1}$ $n$-evaluates to $c = z \cdot c_m$ and that
$b_1 \cdot c \cdot b_k \leq e$. This is exactly the induction hypothesis for the idempotent node
case. If we further gather C4, we get that if the root node of $T$ is a stabilisation node,
$b_1 \cdot c \cdot b_k \leq e^\sharp$. Once more the induction hypothesis is satisfied. □



4. Recognisable cost functions 

We have seen in the previous sections the notion of stabilisation monoids, as well as the 
key technical tools for dealing with them, namely computations, over-computations and 
under-computations. In particular, we have seen Theorem 3.3 and Theorem 3.4 that state 
the existence of computations and the "unicity" of their values. In this section, we put 
these notions in action, and introduce the definition of recognisable cost functions. We will 
see in particular that the hypothesis of Fact 2.0.8 is fulfilled by recognisable cost functions,
and as a consequence the domination problem for cost-monadic logic is decidable over finite
words.

4.1. Recognisable cost functions. Let us fix a stabilisation monoid $M$. An ideal of $M$
is a subset $I$ of $M$ which is downward closed, i.e., such that whenever $a \leq b$ and
$b \in I$ we have $a \in I$. Given a subset $X \subseteq M$, we denote by $X{\downarrow}$ the
least ideal which contains $X$, i.e., $X{\downarrow} = \{y : y \leq x,\ x \in X\}$.

The three ingredients used for recognising a cost function are a monoid $M$, a mapping $h$
from letters of the alphabet $A$ to $M$ (that we extend into a morphism $\tilde{h}$ from $A^*$
to $M^*$), and an ideal $I$.

We then define for each non-negative integer $p$ four functions from $A^*$ to
$\mathbb{N} \cup \{\infty\}$:

$$[\![M,h,I]\!]^{--}_p(u) = \inf\{n : \text{there exists an } n\text{-under-computation of value in } M \setminus I \text{ of height at most } p \text{ for } \tilde{h}(u)\}$$

$$[\![M,h,I]\!]^{-}_p(u) = \inf\{n : \text{there exists an } n\text{-computation of value in } M \setminus I \text{ of height at most } p \text{ for } \tilde{h}(u)\}$$

$$[\![M,h,I]\!]^{+}_p(u) = \sup\{n+1 : \text{there exists an } n\text{-computation of value in } I \text{ of height at most } p \text{ for } \tilde{h}(u)\}$$

$$[\![M,h,I]\!]^{++}_p(u) = \sup\{n+1 : \text{there exists an } n\text{-over-computation of value in } I \text{ of height at most } p \text{ for } \tilde{h}(u)\}$$

These four functions (for each $p$) are candidates to be recognised by $M, h, I$. The following
lemma shows that if $p$ is sufficiently large, all four functions belong to the same cost
function.

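Lemma 4.1. For all $p \geq 3|M|$,

$$[\![M,h,I]\!]^{--}_p \leq [\![M,h,I]\!]^{-}_p \leq [\![M,h,I]\!]^{+}_p \leq [\![M,h,I]\!]^{++}_p .$$

For all $p$, there exists a polynomial $\alpha$ such that

$$[\![M,h,I]\!]^{++}_p \leq \alpha \circ [\![M,h,I]\!]^{--}_p .$$

For all $p \leq r$,

$$[\![M,h,I]\!]^{--}_r \leq [\![M,h,I]\!]^{--}_p , \quad \text{and} \quad [\![M,h,I]\!]^{++}_p \leq [\![M,h,I]\!]^{++}_r .$$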

Proof. All the inequalities are direct consequences of the results of the previous section.

The middle inequality in the first equation simply comes from the fact that, thanks
to Theorem 3.3, the union of the set of integers over which ranges the infimum in the
definition of $[\![M,h,I]\!]^{-}_p$ with the set of integers over which ranges the supremum in the
definition of $[\![M,h,I]\!]^{+}_p$ is $\mathbb{N}$. This suffices for justifying the middle
inequality. The first and last inequalities of the equation simply come from the fact that each
$n$-computation is in particular an $n$-under-computation and an $n$-over-computation
respectively.

The second line is directly deduced from Theorem 3.4.

The third statement simply comes from the fact that each $n$-under-computation of
height at most $p$ is also an $n$-under-computation of height at most $r$ (left inequality). The
same goes for over-computations. □

The main consequence of this lemma takes the form of a definition. 

Definition 4.2. For all stabilisation monoids $M$, all mappings $h$ from an alphabet $A$ to $M$,
and all ideals $I$, there exists a unique cost function $[\![M,h,I]\!]$ from $A^*$ to
$\mathbb{N} \cup \{\infty\}$ such that for all $p \geq 3|M|$, all the functions
$[\![M,h,I]\!]^{--}_p$, $[\![M,h,I]\!]^{-}_p$, $[\![M,h,I]\!]^{+}_p$ and $[\![M,h,I]\!]^{++}_p$
belong to $[\![M,h,I]\!]$. It is called the cost function recognised by $M, h, I$. One sometimes
also writes that it is recognised by $M$.

Example 4.2.1. The cost function $|\cdot|_a$ is recognised by
$M, \{a \mapsto a, b \mapsto b\}, \{0\}$, where $M$ is the stabilisation monoid of
Example 3.1.1. In particular, the informal reasoning developed
in the example such as "a few+a few=a few" now has a formal meaning: the imprecision
in such arguments is absorbed in the equivalence up to $\alpha$ of computation trees, and results
in the fact that the monoid does not define a unique function, but instead directly a cost
function.

Another example is the case of standard regular languages. 

Example 4.2.2. Let us recall that a monoid $M$ together with $h$ from $A$ to $M$ and a subset
$F \subseteq M$ is said to recognise a language $L$ over $A$ if for all words $u$, $u \in L$ if
and only if $\pi(\tilde{h}(u)) \in F$. The same monoid can be seen, thanks to Remark 3.1.3, as a
stabilisation monoid. In this case, thanks to Remark 3.4.2, the same $M, h, F$ recognises the
characteristic mapping of $L$.

An elementary result is also the closure under composition with a morphism. 

Fact 4.2.3. Let $M, h, I$ recognise a cost function $f$ over $A^*$, and let $z$ be a mapping from
another alphabet $B$ to $A$, then $M, h \circ z, I$ recognises $f \circ \tilde{z}$.

Proof. One easily checks that the computations involved in the definition of
$[\![M,h,I]\!]^{+}_p(\tilde{z}(u))$ are exactly the same as the ones involved in the definition of
$[\![M, h \circ z, I]\!]^{+}_p(u)$. □

We continue this section by developing other tools for analysing the recognisable cost 
functions. 



4.2. The $\sharp$-expressions. We now present the notion of $\sharp$-expressions. This provides a
convenient notation in several situations. This object was introduced by Hashiguchi for
studying distance automata [17]. The $\sharp$-expressions can be seen in two different ways. On
one side, a $\sharp$-expression denotes an element in a stabilisation monoid. On the
other side, a $\sharp$-expression denotes an infinite sequence of words. Such sequences are used
as witnesses, e.g., of the non-existence of a bound for a function (if the function tends
toward infinity over this sequence), or of the non-divergence of a function $f$ (if the function
is bounded over the sequence). More generally, $\sharp$-expressions will be used as witnesses of
non-domination.

In a finite monoid $M$, given an element $a \in M$, one denotes by $a^\omega$ the only idempotent
which is a power of $a$. This element does not exist in general for infinite monoids, while
it always does for finite monoids (our case). Furthermore, when it exists, it is unique. In
particular in a finite monoid $M$, $a^\omega = a^\Omega$, where $\Omega$ is some multiple of
$|M|!$. This is a useful notion since the operator of stabilisation is only defined for
idempotents. In a stabilisation monoid, let us denote by $a^{\omega\sharp}$ the element
$(a^\omega)^\sharp$. As opposed to $a^\sharp$ which is not defined if $a$ is not idempotent,
$a^{\omega\sharp}$ is always defined. We consider $\Omega$ as fixed from now.

A $\sharp$-expression over a set $A$ is an expression composed of letters from $A$, products, and
exponents with $\omega\sharp$. A $\sharp$-expression $E$ over a stabilisation monoid denotes a
computation in this stabilisation monoid. It naturally evaluates to an element of the
stabilisation monoid, denoted value($E$), and called the value of $E$. A $\sharp$-expression is
called strict if it contains at least one occurrence of $\omega\sharp$.

Given a set $A \subseteq M$, call $\langle A \rangle^\sharp$ the set of values of
$\sharp$-expressions over $A$. Equivalently, it is the least set which contains $A$ and is closed
under product and stabilisation of idempotents. One also denotes by
$\langle A \rangle^{\sharp+}$ the set of values of strict $\sharp$-expressions over $A$.

The $n$-unfolding of a $\sharp$-expression over $A$ is a word in $A^+$ defined inductively as
follows:

    unfold($a$, $n$) = $a$
    unfold($EF$, $n$) = unfold($E$, $n$) unfold($F$, $n$)
    unfold($E^{\omega\sharp}$, $n$) = unfold($E$, $n$) ... unfold($E$, $n$)    ($n$ times)
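The inductive definition of the $n$-unfolding can be sketched in Python as follows. This is a minimal illustration, not part of the paper: the class names `Letter`, `Product` and `OmegaSharp` are hypothetical, letters are assumed to be one-character strings, and words are represented as plain strings.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Letter:
    """A single letter of the alphabet (hypothetical representation)."""
    symbol: str


@dataclass(frozen=True)
class Product:
    """The product E F of two sharp-expressions."""
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class OmegaSharp:
    """The exponent E^(omega sharp) applied to a sharp-expression."""
    body: "Expr"


Expr = Union[Letter, Product, OmegaSharp]


def unfold(expr: Expr, n: int) -> str:
    """Return the n-unfolding of expr, following the three inductive rules."""
    if isinstance(expr, Letter):
        # unfold(a, n) = a
        return expr.symbol
    if isinstance(expr, Product):
        # unfold(EF, n) = unfold(E, n) unfold(F, n)
        return unfold(expr.left, n) + unfold(expr.right, n)
    if isinstance(expr, OmegaSharp):
        # unfold(E^(omega sharp), n) = unfold(E, n) repeated n times
        return unfold(expr.body, n) * n
    raise TypeError(f"not a sharp-expression: {expr!r}")
```

For instance, unfolding the expression b(ab)^{ω♯} with n = 3 concatenates b with three copies of ab.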

We conclude this section by showing how $\sharp$-expressions can be used as witnesses of the
behaviour of a regular cost function.

Proposition 4. Assume $M, h, I$ recognises $f$, and let $E$ be a $\sharp$-expression over $A$ of
value $a$, then:

• if $a \in I$ then $\{f(\mathrm{unfold}(E, \Omega n)) : n \geq 1\}$ tends toward infinity,

• if $a \notin I$ then $\{f(\mathrm{unfold}(E, \Omega n)) : n \geq 1\}$ is bounded.

Proof. We need, given a positive integer $n$, to produce an $n$-computation of value $a$. It is
defined as follows:

    computation($a$, $n$) = $a$
    computation($EF$, $n$) = value($EF$)[computation($E$, $n$), computation($F$, $n$)]
    computation($E^{\omega\sharp}$, $n$) = value($E^{\omega\sharp}$)[computation($E^\Omega$, $n$), ..., computation($E^\Omega$, $n$)]    ($n$ times)
    computation($E^1$, $n$) = computation($E$, $n$)
    computation($E^k$, $n$) = value($E^k$)[computation($E$, $n$), computation($E^{k-1}$, $n$)]    (for $k > 1$)

It is easy to check that computation($E$, $n$) is an $n$-computation for
unfold($E$, $\Omega n$), that its value is value($E$) and that its height $H$ depends only upon
$E$. What we have is in fact stronger: computation($E$, $n$) is an $m$-computation for
unfold($E$, $\Omega n$) for all $m < n$.

• Let us suppose $a \in I$. We have:

$$[\![M,h,I]\!]^{+}_H(\mathrm{unfold}(E, \Omega n)) = \sup\{m+1 : \text{there is an } m\text{-computation for } \tilde{h}(\mathrm{unfold}(E, \Omega n)) \text{ of value in } I\} \geq n$$

(using computation($E$, $n$) as a witness). Thus $f(\mathrm{unfold}(E, \Omega n))$ tends toward
infinity when $n$ tends to infinity.

• Suppose now that $a \notin I$. One obtains:

$$[\![M,h,I]\!]^{-}_H(\mathrm{unfold}(E, \Omega n)) = \inf\{m : \text{there is an } m\text{-computation for } \tilde{h}(\mathrm{unfold}(E, \Omega n)) \text{ of value in } M \setminus I\} = 0$$

(using computation($E$, $n$) as a witness). Thus
$\{f(\mathrm{unfold}(E, \Omega n)) : n \in \mathbb{N}\}$ is bounded. □



4.3. Quotients, sub-stabilisation monoids and products. We introduce here some 
useful definitions for manipulating and composing stabilisation monoids. We use them for 
proving the closure of recognisable cost functions under min and max (Corollary 4.5). 

Let $M = (M, \cdot, \sharp, \leq)$ and $M' = (M', \cdot', \sharp', \leq')$ be stabilisation
monoids. A morphism of stabilisation monoids from $M$ to $M'$ is a mapping $\mu$ from $M$ to $M'$
such that

• $\mu(1_M) = 1_{M'}$,

• $\mu(x) \cdot' \mu(y) = \mu(x \cdot y)$ for all $x, y$ in $M$,

• for all $x, y$ in $M$, if $x \leq y$ then $\mu(x) \leq' \mu(y)$,

• $\mu(e)^{\sharp'} = \mu(e^\sharp)$ for all $e \in E(M)$ (in this case, $\mu(e) \in E(M')$).

Remark 4.2.4. The $n$-computations (resp., $n$-under-computations, $n$-over-computations)
over $M$ are transformed by morphism (applied to each node of the tree) into $n$-computations
(resp., $n$-under-computations, $n$-over-computations) over $M'$. In a similar way, the image
under morphism of a $\sharp$-expression over $M$ is a $\sharp$-expression over $M'$.

We immediately obtain: 

Lemma 4.3. For $\mu$ a morphism of stabilisation monoids from $M$ to $M'$, $h$ a mapping from
an alphabet $A$ to $M$ and $I'$ an ideal of $M'$, we have:

$$[\![M, h, \mu^{-1}(I')]\!] = [\![M', \mu \circ h, I']\!] .$$

Proof. Let us remark first that $I = \mu^{-1}(I')$ is an ideal of $M$.

Let $u$ be a word in $A^*$. Let us consider an $n$-computation over $M$ for $\tilde{h}(u)$ of
value $a \in \mu^{-1}(I')$. This computation can be transformed by morphism into an
$n$-computation over $M'$ for $\widetilde{(\mu \circ h)}(u)$ of value $\mu(a) \in I'$. In a
similar way, each $n$-computation over $\tilde{h}(u)$ of value $a \in M \setminus \mu^{-1}(I')$
can be transformed into an $n$-computation of value $\mu(a) \in M' \setminus I'$. □

The notion of morphism is intimately related to the notion of product. Given two
stabilisation monoids $M = (M, \cdot, \sharp, \leq)$ and $M' = (M', \cdot', \sharp', \leq')$, one
defines their product by:

$$M \times M' = (M \times M', \cdot'', \sharp'', \leq'')$$

where $(x, x') \cdot'' (y, y') = (x \cdot y, x' \cdot' y')$,
$(e, e')^{\sharp''} = (e^\sharp, e'^{\sharp'})$, and $(x, x') \leq'' (y, y')$ if and only
if $x \leq y$ and $x' \leq' y'$.
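The componentwise definition of the product can be sketched in Python as follows. This is an illustration only: a stabilisation monoid is represented here by three functions (product, stabilisation, order), and the three-element monoid used in the example (elements "b", "a", "0" with the chain order 0 ≤ a ≤ b) is a hypothetical toy in the spirit of the counting example, not a transcription of any monoid defined in the paper.

```python
def product_structure(mul1, sharp1, leq1, mul2, sharp2, leq2):
    """Build the product, stabilisation and order of M x M' componentwise."""

    def mul(p, q):
        # (x, x') . (y, y') = (x . y, x' .' y')
        return (mul1(p[0], q[0]), mul2(p[1], q[1]))

    def sharp(p):
        # (e, e')^sharp = (e^sharp, e'^sharp'); only meaningful on idempotent pairs
        return (sharp1(p[0]), sharp2(p[1]))

    def leq(p, q):
        # (x, x') <= (y, y') iff x <= y and x' <=' y'
        return leq1(p[0], q[0]) and leq2(p[1], q[1])

    return mul, sharp, leq


# Hypothetical toy monoid: "b" is neutral, "0" is absorbing, "a" is idempotent.
def toy_mul(x, y):
    if "0" in (x, y):
        return "0"
    if x == "b":
        return y
    if y == "b":
        return x
    return "a"


def toy_sharp(e):
    # Assumed stabilisation: b stays b, a collapses to 0, 0 stays 0.
    return {"b": "b", "a": "0", "0": "0"}[e]


RANK = {"0": 0, "a": 1, "b": 2}  # assumed chain order 0 <= a <= b


def toy_leq(x, y):
    return RANK[x] <= RANK[y]


pmul, psharp, pleq = product_structure(
    toy_mul, toy_sharp, toy_leq, toy_mul, toy_sharp, toy_leq
)
```

Projecting a pair onto either component commutes with all three operations, which is the morphism property used in the sequel.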

As expected, the projection over the first component (resp., second component) is a
morphism of stabilisation monoids from $M \times M'$ onto $M$ (resp., onto $M'$). It follows by
Lemma 4.3 that if $f$ is recognised by $M, h, I$ and $g$ by $M', h', I'$, then $f$ is also
recognised by $M \times M', h \times h', I \times M'$ and $g$ by
$M \times M', h \times h', M \times I'$, in which one sets
$(h \times h')(a) = (h(a), h'(a))$ for all letters $a$. Thus one obtains:

Lemma 4.4. If $f$ and $g$ are recognisable cost functions over $A^*$, there exist a stabilisation
monoid $M$, a mapping $h$ from $A$ to $M$ and two ideals $I, J$ such that $M, h, I$ recognises
$f$ and $M, h, J$ recognises $g$.

Corollary 4.5. If $f$ and $g$ are recognisable, then so are $\max(f, g)$ and $\min(f, g)$.

Proof. According to Lemma 4.4 one assumes $f$ recognised by $M, h, I$ and $g$ by $M, h, J$.
Then (for a height fixed to at most $p = 3|M|$) one has:

$$\max([\![M,h,I]\!]^{+}_p, [\![M,h,J]\!]^{+}_p)(u)$$

$$= \max \begin{cases} \sup\{n+1 : \text{there is an } n\text{-computation for } \tilde{h}(u) \text{ of value in } I\} \\ \sup\{n+1 : \text{there is an } n\text{-computation for } \tilde{h}(u) \text{ of value in } J\} \end{cases}$$

$$= \sup\{n+1 : \text{there exists an } n\text{-computation over } \tilde{h}(u) \text{ of value in } I \cup J\}$$

$$= [\![M,h,I \cup J]\!]^{+}_p(u) .$$

Thus $\max(f, g)$ is recognised by $M, h, I \cup J$. In a similar way, $\min(f, g)$ is recognised
by $M, h, I \cap J$. □



4.4. Decidability of the domination relation. We are now ready to establish the
decidability of the domination relation.

Theorem 4.6. The domination relation ($\preccurlyeq$) is decidable over recognisable cost
functions.

Proof. Let $f, g$ be recognisable cost functions. According to Lemma 4.4 there exist a
stabilisation monoid $M$, a mapping $h$ from the alphabet $A$ to $M$ and two ideals $I, J$ such
that $M, h, I$ recognises $f$ and $M, h, J$ recognises $g$.

We show that $f \preccurlyeq g$ (i.e., $g$ dominates $f$) if and only if the following
(decidable) property holds:

$$\langle h(A) \rangle^\sharp \cap I \subseteq J .$$

First direction. Let us suppose $\langle h(A) \rangle^\sharp \cap I \subseteq J$. Of course, every
$n$-computation for $\tilde{h}(u)$, a word over $h(A)$, has its value in
$\langle h(A) \rangle^\sharp$. It follows that (for heights at most $3|M|$):

$$[\![M,h,I]\!]^{+}(u) = \sup\{n+1 : \text{there is an } n\text{-computation for } \tilde{h}(u) \text{ of value in } I\}$$

$$\leq \sup\{n+1 : \text{there is an } n\text{-computation for } \tilde{h}(u) \text{ of value in } J\}$$

$$= [\![M,h,J]\!]^{+}(u) .$$

Thus we have $f \preccurlyeq g$.

Second direction. Let us suppose the existence of
$a \in \langle h(A) \rangle^\sharp \cap I \setminus J$. By definition of
$\langle h(A) \rangle^\sharp$, there is a $\sharp$-expression $E$ over $h(A)$ of value $a$. Let
$F$ be the $\sharp$-expression over $A$ obtained by substituting to each element $x \in h(A)$ some
letter $c \in A$ such that $h(c) = x$. According to Proposition 4, $f$ is unbounded over
$\{\mathrm{unfold}(F, \Omega n) : n \geq 1\}$. However, still applying Proposition 4, $g$ is
bounded over $\{\mathrm{unfold}(F, \Omega n) : n \geq 1\}$. This witnesses that $g$ does not
dominate $f$. □
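The decidable property used in this proof amounts to a finite closure computation followed by an inclusion test. The following Python sketch illustrates it under stated assumptions: the function names `sharp_closure` and `is_dominated` are hypothetical, the monoid is given by its product and stabilisation functions, and the three-element monoid at the end is a toy example (neutral "b", absorbing "0", idempotent "a" with assumed stabilisation a ↦ 0), not a monoid taken from the paper.

```python
from itertools import product as pairs


def sharp_closure(generators, mul, sharp):
    """Least set containing generators, closed under product and under
    stabilisation of idempotents (the set <X>^sharp of the text)."""
    closed = set(generators)
    while True:
        new = {mul(x, y) for x, y in pairs(closed, repeat=2)}
        # Stabilisation is only applied to idempotent elements.
        new |= {sharp(e) for e in closed if mul(e, e) == e}
        if new <= closed:
            return closed
        closed |= new


def is_dominated(generators, mul, sharp, ideal_i, ideal_j):
    """Test <h(A)>^sharp ∩ I ⊆ J, where generators plays the role of h(A)."""
    reachable = sharp_closure(generators, mul, sharp)
    return (reachable & set(ideal_i)) <= set(ideal_j)


# Hypothetical toy monoid used only to exercise the functions.
def toy_mul(x, y):
    if "0" in (x, y):
        return "0"
    if x == "b":
        return y
    if y == "b":
        return x
    return "a"


def toy_sharp(e):
    return {"b": "b", "a": "0", "0": "0"}[e]
```

The loop terminates because the monoid is finite and the set only grows. With generators {a, b}, the element 0 is reachable as the stabilisation of a, so the test with I = {0} fails as soon as J does not contain 0.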



4.5. Closure under inf-projection. We establish the following theorem. 
Theorem 4.7. Recognisable cost functions are effectively closed under inf-projection. 

The projection in the classical case (of recognisable languages of finite words) requires 
a powerset construction (for monoids as for deterministic automata). In our case, the 
approach is similar. Let z be a mapping from alphabet A to B. The goal of a stabilisation 
monoid which would recognise the inf-projection by z of a recognisable cost function is to 
keep track of all the values a computation could have taken for some inverse image by z of 
the input word. Hence an element of the stabilisation monoid for the inf-projection of the 
cost function consists naturally of a set of elements in the original monoid. 

A closer inspection reveals that it is possible to close these subsets downward, i.e., 
to consider only ideals. In fact, it is not only possible, but it is even necessary for the 
construction to go through. Let us describe more formally this construction. 

We have to consider a construction of ideals. Let $M_\downarrow$ be the set of ideals of $M$. One
equips $M_\downarrow$ with an order simply by inclusion:

$$I \leq J \quad \text{if} \quad I \subseteq J ,$$

and with a product as follows:

$$A \cdot B = \{a \cdot b : a \in A,\ b \in B\}{\downarrow} .$$

Finally, the stabilisation is defined for an idempotent $E$ by:

$$E^\sharp = \langle E \rangle^{\sharp+}{\downarrow} .$$

The resulting structure $(M_\downarrow, \cdot, \leq, \sharp)$ is denoted $\mathbf{M}_\downarrow$.
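The order and product on ideals can be sketched in Python as follows. This is an illustration under assumptions: ideals are represented as frozensets, `down_closure` and `ideal_product` are hypothetical helper names, and the three-element toy monoid with the chain order 0 ≤ a ≤ b is assumed for the example rather than taken from the paper. The stabilisation of ideals is not covered by this sketch.

```python
def down_closure(subset, elements, leq):
    """Least ideal containing subset: all y with y <= x for some x in subset."""
    return frozenset(y for y in elements if any(leq(y, x) for x in subset))


def ideal_product(a, b, elements, mul, leq):
    """Product of two ideals: downward closure of the elementwise products."""
    return down_closure({mul(x, y) for x in a for y in b}, elements, leq)


def ideal_leq(i, j):
    """Order on ideals: plain inclusion."""
    return i <= j


# Hypothetical toy monoid: "b" neutral, "0" absorbing, "a" idempotent.
ELEMENTS = ("0", "a", "b")
RANK = {"0": 0, "a": 1, "b": 2}  # assumed chain order 0 <= a <= b


def toy_leq(x, y):
    return RANK[x] <= RANK[y]


def toy_mul(x, y):
    if "0" in (x, y):
        return "0"
    if x == "b":
        return y
    if y == "b":
        return x
    return "a"
```

Taking the downward closure after multiplying is what keeps the result an ideal even when the raw set of products is not downward closed.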

It may seem a priori that our first goal would be to prove that the structure defined in
the above way is indeed a stabilisation monoid. In fact, thanks to Proposition 3, this will
be for free (see Lemma 4.12 below).

We now prove that $\mathbf{M}_\downarrow$ can be used for recognising the inf-projection of a
cost function recognised by $M$. Thus our goal is to relate the (under)-computations in
$\mathbf{M}_\downarrow$ to the (under)-computations in $M$. This will provide a semantic link
between the two stabilisation monoids. This relationship takes the form of Lemmas 4.10 and 4.11
below.



Let us first state a simple remark on the structure of idempotents. 

Lemma 4.8. If $E$ is an idempotent in $\mathbf{M}_\downarrow$, then for all $a \in E$ there exist
$b, c, e \in E$ with $e$ idempotent such that $a \leq b \cdot e \cdot c$.

Proof. As $E = E \cdots E$ ($n$ times), for all $n \geq 1$, there exist
$a_1, \dots, a_n \in E$ such that $a \leq a_1 \cdots a_n$. Hence using Ramsey's theorem, for $n$
sufficiently large, there exist $1 < i \leq j < n$ such that $a_i \cdots a_j = e$ is an
idempotent. One sets $b = a_1 \cdots a_{i-1}$, $c = a_{j+1} \cdots a_n$. We have
$b, c, e \in E$, and $a \leq b \cdot e \cdot c$. □

A similar characterisation holds for the stabilisation of idempotents. 

Lemma 4.9. If $E$ is a stable idempotent in $\mathbf{M}_\downarrow$ (i.e. such that
$E = E^\sharp$), then for all $a \in E^\sharp$ there exist $b, c, e \in E$ with $e$ idempotent
such that $a \leq b \cdot e^\sharp \cdot c$.

Proof. By definition of $E^\sharp$, there is a strict $\sharp$-expression $F$ over $E$ such that
$a \leq$ value($F$). Thus it is sufficient to prove, by induction, that every strict
$\sharp$-expression $F$ is such that value($F$) $\leq b \cdot e^\sharp \cdot c$ for some
$b, e, c$ in $E$. The base case is $F = G^{\omega\sharp}$ where $G$ is a non strict
$\sharp$-expression. In this case value($G$) $= g \in E$. It follows that
value($G^{\omega\sharp}$) $=$ value($G$)$^{\omega\sharp} \leq g^\omega \cdot g^{\omega\sharp} \cdot g^\omega$.
Thus the induction hypothesis holds. The other case is the product $F = GH$.
By induction hypothesis, value($G$) $\leq a \cdot e^\sharp \cdot b$ and
value($H$) $\leq a' \cdot f^\sharp \cdot b'$. It follows that
value($F$) $\leq a \cdot e^\sharp \cdot d$ with $d = b \cdot a' \cdot f^\sharp \cdot b' \in E$.
Once more the induction hypothesis holds. □

Lemma 4.10. Let $A_1 \dots A_k$ be a word over $M_\downarrow$ and let $T$ be an
$n$-under-computation over $A_1 \dots A_k$ of height at most $p$ and of value $A$. For all
$a \in A$, there exists an $n$-under-computation of height $3p$ and value $a$ for some word
$a_1 \dots a_k$ such that $a_1 \in A_1, \dots, a_k \in A_k$.

Proof. The proof is by induction on p. 

Leaf case, i.e., $T = A_1$. Let $a \in A \subseteq A_1$, then $a$ is an $n$-computation of value
$a \in A$.

Binary node, i.e., $T = A[T_1, T_2]$. Let $B_1$ and $B_2$ be the respective values of $T_1$ and
$T_2$. Let $a \in A \subseteq B_1 \cdot B_2$. By definition of the product, there exist
$b_1 \in B_1$ and $b_2 \in B_2$ such that $a \leq b_1 \cdot b_2$. By induction hypothesis, there
exist $n$-under-computations $t_1$ and $t_2$ of respective values $b_1$ and $b_2$. The
$n$-under-computation $a[t_1, t_2]$ satisfies the induction hypothesis.

Idempotent node. $T = F[T_1, \dots, T_k]$ for some $k \leq n$ where $F \subseteq E$ for an
idempotent $E$ such that the value of $T_i$ is $E$ for all $i$. Let $a \in F \subseteq E$. We have
$a \leq b \cdot e \cdot c$ for some $b, c, e \in E$ (Lemma 4.8). We then apply the induction
hypothesis for $b, e, \dots, e$ and $c$ on the $n$-under-computations $T_1, \dots, T_{k-1}$ and
$T_k$ respectively, yielding the $n$-under-computations $t_1, \dots, t_{k-1}$ and $t_k$
respectively. The tree $a[t_1, (e \cdot c)[e[t_2, \dots, t_{k-1}], t_k]]$ is an
$n$-under-computation witnessing that the induction hypothesis holds.

Stabilisation node. $T = F[T_1, \dots, T_k]$ for some $k > n$ and $F \subseteq E^\sharp$ for some
idempotent $E$ such that the value of $T_i$ is $E$ for all $i$. Let
$a \in F \subseteq E^\sharp$. We have $a \leq b \cdot e^\sharp \cdot c$ for some
$b, e, c \in E$ (Lemma 4.9). We then apply the induction hypothesis for $b, e, \dots, e$ and $c$
respectively and the computations $T_1, \dots, T_k$ respectively, yielding the
$n$-under-computations $t_1, \dots, t_k$ respectively. We conclude by constructing the
$n$-under-computation
$a[t_1, (e^\sharp \cdot c)[e^\sharp[t_2, \dots, t_{k-1}], t_k]]$ (remark that
$e^\sharp[t_2, \dots, t_{k-1}]$ is a valid under-computation since $e^\sharp \leq e$). □

Lemma 4.11. There exists a polynomial $\alpha$ such that for all words $A_1 \dots A_k$ over
$M_\downarrow$ and all $\alpha(n)$-over-computations $T$ for $A_1 \dots A_k$ of height $p$ and
value $A$, and all $a_1 \in A_1, \dots, a_k \in A_k$, there exists an $n$-computation over
$a_1 \dots a_k$ of value $a \in A$ and of height at most $3|M|p$.

Proof. The proof is by induction on $p$. Set $\alpha(n) = n^{3|M|}$ for all $n$.

Leaf case, i.e., $T = A$ and $u = a_1 \in A_1 \subseteq A$. Hence $a_1$ is a computation
satisfying the induction hypothesis.

Binary node, i.e., $T = A[T_1, T_2]$ where $T_1$ and $T_2$ have respective values $B_1$ and $B_2$
such that $B_1 \cdot B_2 \subseteq A$. One applies the induction hypothesis on $T_1$ and $T_2$,
and gets computations $t_1$ and $t_2$, of respective values $b_1 \in B_1$ and $b_2 \in B_2$. The
induction hypothesis is then fulfilled with the $n$-computation $(b_1 \cdot b_2)[t_1, t_2]$ of
value $b_1 \cdot b_2 \in A$.

Idempotent node, i.e., $T = F[T_1, \dots, T_k]$ for $k \leq n^{3|M|}$ where $T_1, \dots, T_k$
share the same idempotent value $E \subseteq F$. Let $t_1, \dots, t_k$ be the $n$-computations of
respective values $b_1, \dots, b_k$ obtained by applying the induction hypothesis on
$T_1, \dots, T_k$ respectively. Furthermore, according to Theorem 3.3, there exists an
$n$-computation $t$ for the word $b_1 \dots b_k$ of height at most $3|M|$. Let $a$ be the value of
$t$. Since $E$ is an idempotent, it is closed under product and stabilisation and contains
$b_1, \dots, b_k$. It follows that $a \in E$ (by induction on the height of $t$). The induction
hypothesis holds using the witness $n$-computation $t\{t_1, \dots, t_k\}$ where
$t\{t_1, \dots, t_k\}$ is obtained from $t$ by substituting the $i$th leaf for $t_i$ for all
$i = 1 \dots k$.

Stabilisation node, i.e., $T = F[T_1, \dots, T_k]$ for $k > n^{3|M|}$ where $T_1, \dots, T_k$ all
share the same idempotent value $E$ such that $E^\sharp \subseteq F$. Let $t_1, \dots, t_k$ be the
$n$-computations of respective values $b_1, \dots, b_k$ obtained by applying the induction
hypothesis on $T_1, \dots, T_k$ respectively. Furthermore, according to Theorem 3.3, there exists
an $n$-computation $t$ for $b_1 \dots b_k$ of height at most $3|M|$. Since $t$ has height at most
$3|M|$ and has more than $n^{3|M|}$ leaves, it contains at least one node of degree more than $n$,
namely, a stabilisation node. It is then easy to prove by induction on the height of $t$ that the
value of $t$ belongs to $\langle E \rangle^{\sharp+}$. Thus the $n$-computation
$t\{t_1, \dots, t_k\}$ (as defined in the above case) is a witness for the induction
hypothesis. □

Lemma 4.12. $\mathbf{M}_\downarrow$ is a stabilisation monoid.

Proof. Consider an $n$-under-computation $T$ of value $A$ in $\mathbf{M}_\downarrow$ for some word
$A_1 \dots A_k$ of height at most $p$, and some $\alpha(n)$-over-computation $T'$ for the same
word of value $B$ (with $\alpha(n) = \alpha'(n)^{3|M|}$ where $\alpha'$ is obtained from
Theorem 3.4 applied to $M$ for height at most $3p$). We aim at $A \leq B$. Indeed, this implies
that one can use Proposition 3 and get that $\mathbf{M}_\downarrow$ is a stabilisation semigroup
(it is then straightforward to prove it a stabilisation monoid).

Let $a \in A$, we aim at $a \in B$, thus proving $A \subseteq B$, i.e., $A \leq B$. By
Lemma 4.10 there exists an $n$-under-computation for some word $a_1 \dots a_k$ with
$a_1 \in A_1, \dots, a_k \in A_k$, of height at most $3p$ and value $a$. Applying Lemma 4.11 on
$T'$, there exists an $\alpha'(n)$-over-computation for the same word $a_1 \dots a_k$ of value
$b \in B$. Then applying Theorem 3.4 we get $a \leq b$. Hence $a \in B$ since $B$ is downward
closed. □

We can now establish the result of this section. 



Proof of Theorem 4.7. Assume a function $f$ over the alphabet $\mathbb{A}$ is recognised by
$M, h, I$, and let $z$ be some mapping from $\mathbb{A}$ to $\mathbb{B}$. Let $H$ be the mapping
from $\mathbb{B}$ to $M_\downarrow$ which to $b \in \mathbb{B}$ associates
$h(z^{-1}(b)){\downarrow}$, and let $K \subseteq M_\downarrow$ be

$$K = \{J \in M_\downarrow : J \subseteq I\} .$$

We shall prove that $\mathbf{M}_\downarrow, H, K$ recognise the cost function of $f_{\inf,z}$.
There are two directions. Consider a word $u = b_1 \dots b_k$ over alphabet $\mathbb{B}$ such that

This means that there exists an $\alpha(n)$-under-computation over
$A_1 \dots A_k = \tilde{H}(u)$ of value $J \in M_\downarrow \setminus K$, and height at most
$3|M_\downarrow|$. By definition of $K$, this means that there exists some
$a \in J \setminus I$. Let us apply now Lemma 4.10 and obtain an $n$-under-computation for some
$a_1 \dots a_k$ of value $a \notin I$, and of height at most $d = 9|M_\downarrow|$, where
$a_i \in A_i$ for all $i = 1 \dots k$. By definition of $H$, there exists $c_i \in \mathbb{A}$
such that $z(c_i) = b_i$ and $h(c_i) = a_i$. Thus, set $v = c_1 \dots c_k$. The word $v$ is such
that $\tilde{z}(v) = u$ and $[\![M,h,I]\!]^{--}_d(v) \leq n$. This is a witness that

$$([\![M,h,I]\!]^{--}_d)_{\inf,z}(u) \leq n .$$

Conversely, consider a word $u$ over the alphabet $\mathbb{B}$ such that

$$[\![\mathbf{M}_\downarrow, H, K]\!]^{++}_{3|M_\downarrow|}(u) \geq \alpha(n) + 1 ,$$

where $\alpha$ is the polynomial obtained from Lemma 4.11. This means that there exists an
$\alpha(n)$-over-computation for the word $A_1 \dots A_k = \tilde{H}(u)$ of height at most
$3|M_\downarrow|$ and value $J \in K$. Consider now some word $v = c_1 \dots c_k$ over the
alphabet $\mathbb{A}$ such that $\tilde{z}(v) = u$. Let $a_1 \dots a_k = \tilde{h}(v)$. According
to Lemma 4.11 there exists an $n$-computation for $a_1 \dots a_k$ of height at most
$d = 9|M||M_\downarrow|$ and value $a \in J$. Since by definition of $K$, $J \subseteq I$, this
means that $a \in I$. Thus, this computation witnesses that $[\![M,h,I]\!]^{+}_d(v) \geq n$. Since
this holds for all words $v$ such that $\tilde{z}(v) = u$, we obtain that

$$([\![M,h,I]\!]^{+}_d)_{\inf,z}(u) \geq n .$$

□



4.6. Closure under sup-projection. We now establish the closure under sup-projection, 
using a proof very similar to the previous section. 

Theorem 4.13. Recognisable cost functions are effectively closed under sup-projection. 

The closure under sup-projection follows the same principle as the closure under
inf-projection. It also uses a powerset construction. However, since everything is reversed,
this is a construction of co-ideals. A co-ideal is a subset of a stabilisation monoid which is
upward closed. Let $M_\uparrow$ be the set of co-ideals over a given stabilisation monoid $M$.
Given a set $A$, let us denote by $A{\uparrow}$ the least co-ideal containing $A$, i.e.,
$A{\uparrow} = \{y : y \geq x \in A\}$. One equips $M_\uparrow$ with an order by:

$$I \leq J \quad \text{if} \quad I \supseteq J ,$$

with a product by:

$$A \cdot B = \{a \cdot b : a \in A,\ b \in B\}{\uparrow} ,$$

and with a stabilisation operation by:

$$E^\sharp = \langle E \rangle^{\sharp+}{\uparrow} .$$

Let us call $\mathbf{M}_\uparrow$ the resulting structure. One can remark that
$\langle E \rangle^{\sharp+}{\uparrow} = \langle E \rangle^{\sharp}{\uparrow}$. This was
not the case for ideals. The proof is extremely close to the case of inf-projection. However,
a careful inspection would show that all computations of bounds, and even some local
arguments need to be modified.

Let us state a simple remark on the structure of idempotents. 

Lemma 4.14. If $E$ is an idempotent in $\mathbf{M}_\uparrow$, then for all $a \in E$ there exist
$b, c, e \in E$ with $e$ idempotent such that $a \geq b \cdot e \cdot c$.

Proof. As $E = E \cdots E$, for all $n$, there exist $a_1, \dots, a_n \in E$ such that
$a \geq a_1 \cdots a_n$. Using Ramsey's theorem, for $n$ sufficiently large, there exist
$1 < i \leq j < n$ such that $a_i \cdots a_j = e$ is an idempotent. One sets
$b = a_1 \cdots a_{i-1}$, $c = a_{j+1} \cdots a_n$. We have $b, c, e \in E$, and
$a \geq b \cdot e \cdot c$. □

Our second preparatory lemma is used for the treatment of stabilisation nodes. 

Lemma 4.15. There exists a polynomial $\alpha$ such that for all idempotents $E$ of
$\mathbf{M}_\uparrow$, all $a \in E^\sharp$ and all $\alpha(m) \leq n$, there exists an
$m$-over-computation of height at most $2|M| + 3$ of value $a$ over some word over $E$ of
length $n$.

Proof. It is sufficient to prove the result for a single pair E, a, and construct for each such 
case a polynomial aE,a- Then, since there are finitely many such pairs {E,a), one can 
choose a polynomial a that is above all the aE,a- This a will witness the lemma for all 
choices of E and a. 

We first claim (*) that if $b \in E$ then for all $n \geq 1$, there exists a word $w_n$ over $E$
of length $n$ such that for all $m \geq 1$ there exists an $m$-over-computation for $w_n$ of height at
most 3 of value $b$. Indeed, by Lemma 4.14, $b \geq c \cdot f \cdot d$ where $c, f, d$ belong to $E$, and $f$
is idempotent. So if $n = 1$, we take $w_1 = b$. If $n = 2$, we take $w_2 = c\,(f \cdot d)$. Finally, for
$n \geq 3$, there is a natural $m$-over-computation of value $c \cdot f \cdot d$ of height 2 or 3 over the word
$c\,\underbrace{f \cdots f}_{n-2 \text{ times}}\,d$, which is of length $n$.
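The three cases of claim (*) can be written down as a small Python sketch that builds the word $w_n$ from $b$, $c$, $f$ and $d$. The function names and the way the monoid product is passed in are assumptions of this sketch; it only builds the words and does not construct the over-computations themselves.

```python
from functools import reduce


def claim_word(n, b, c, f, d, mul):
    """Word w_n of length n from claim (*), given b >= c.f.d with f idempotent."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if mul(f, f) != f:
        raise ValueError("f must be idempotent")
    if n == 1:
        return [b]                    # the single letter b
    if n == 2:
        return [c, mul(f, d)]         # the two letters c and f.d
    return [c] + [f] * (n - 2) + [d]  # c, then n-2 copies of f, then d


def evaluate(word, mul):
    """Plain product of the letters of a non-empty word."""
    return reduce(mul, word)
```

For every $n \geq 2$ the plain product of the letters of the word is $c \cdot f \cdot d$, because $f$ is idempotent; this is the value that the computations of height 2 or 3 in the claim carry.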

Consider now a $\sharp$-expression $e$ of value $a \in E^\sharp$ for some idempotent $E$. Without loss
of generality, we can choose it of height at most $2|M|$, and consider the word $u_m =$
unfold$(e, |M|!(m + 1))$ for all $m \geq 1$. The length of this word is a polynomial $\alpha(m)$, and
there is an $m$-over-computation of height at most $2|M|$ of value $a$ for this word. Consider
now some $n \geq \alpha(m)$. The word $u_m$ can be written $vb$ for some $b \in E$. We can apply the
above claim to $b$ and $n + 1 - \alpha(m)$, yielding the word $w_{n+1-\alpha(m)}$. Combining the two $m$-over-computations,
we then naturally obtain an $m$-over-computation for the word $v\,w_{n+1-\alpha(m)}$
of height at most $2|M| + 3$ and of value $a$, and this word has length $n$. □

Lemma 4.16. There exists a polynomial $\alpha$ such that for all words $A_1 \dots A_k$ over $M_\uparrow$ and
all $\alpha(n)$-over-computations $T$ for $A_1 \dots A_k$ of height at most $p$ of value $A$ and all $a \in A$,
there exists an $n$-over-computation of height at most $(2|M| + 3)p$ and value $a$ for some
word $a_1 \dots a_k$ with $a_1 \in A_1, \dots, a_k \in A_k$.

Proof. The proof is by induction on $p$. We take the polynomial $\alpha$ of Lemma 4.15.

Leaf case, i.e., $T = A_1$. Let $a \in A \subseteq A_1$, then $a$ is an $n$-computation of value $a \in A$.

Binary node, i.e., $T = A[T_1, T_2]$. Let $B_1$ and $B_2$ be the respective values of $T_1$ and $T_2$.
Let $a \in A \subseteq B_1 \cdot B_2$. By definition of the product, there exist $b_1 \in B_1$ and $b_2 \in B_2$ such
that $a \geq b_1 \cdot b_2$. By induction hypothesis, there exist $n$-over-computations $t_1$ and $t_2$ of respective
values $b_1$ and $b_2$. The $n$-over-computation $a[t_1, t_2]$ satisfies the induction hypothesis.

Idempotent node. $T = F[T_1, \dots, T_k]$ for some $k \leq \alpha(n)$ where $F \subseteq E$ for an idempotent
$E$ such that the value of $T_i$ is $E$ for all $i$. Let $a \in F \subseteq E$. We have $a \geq b \cdot e \cdot c$
for some $b, c, e \in E$ with $e$ idempotent (Lemma 4.14). We then apply the induction hypothesis
for $b, e, \dots, e$ and $c$ on the computations $T_1, \dots, T_{k-1}$ and $T_k$ respectively, yielding the
$n$-over-computations $t_1, \dots, t_{k-1}$ and $t_k$ respectively. We conclude by constructing the
$n$-over-computation $a[t_1, (e \cdot c)[e[t_2, \dots, t_{k-1}], t_k]]$.

Stabilisation node. $T = F[T_1, \dots, T_k]$ for some $k > \alpha(n)$ and $F \subseteq E^\sharp$ for some idempotent
$E$ such that the value of $T_i$ is $E$ for all $i$. Let $a \in F \subseteq E^\sharp$. According to Lemma 4.15
there exists a word $a_1 \dots a_k$ over $E$ and an $n$-over-computation $t$ for $a_1 \dots a_k$ of value $a$ and
height at most $2|M| + 3$. We then apply the induction hypothesis for each of $a_1, \dots, a_k$
with the computations $T_1, \dots, T_k$ respectively. This yields $n$-over-computations $t_1, \dots, t_k$
respectively. We conclude by constructing the $n$-over-computation obtained by substituting
in $t$ the $i$th leaf with $t_i$. □

Lemma 4.17. Let $A_1 \dots A_k$ be a word over $M_\uparrow$ and $T$ be an $n$-under-computation
for $A_1 \dots A_k$ of height $p$ and value $A$. For all words $u = a_1 \dots a_k$ with $a_1 \in A_1, \dots, a_k \in A_k$,
there exists an $n$-computation over $a_1 \dots a_k$ of value $a \in A$ and of height at most $3|M|p$.

Proof. The proof is by induction on $p$.

Leaf case, i.e., $T = A$ and $u = a_1 \in A_1 \subseteq A$. Hence $a_1$ is a computation satisfying the
induction hypothesis.

Binary node, i.e., $T = A[T_1, T_2]$ where $T_1$ and $T_2$ have respective values $B_1$ and $B_2$ such that
$B_1 \cdot B_2 \subseteq A$. One applies the induction hypothesis on $T_1$ and $T_2$, and gets computations $t_1$
and $t_2$, of respective values $b_1 \in B_1$ and $b_2 \in B_2$. The induction hypothesis is then fulfilled
with the $n$-computation $(b_1 \cdot b_2)[t_1, t_2]$ of value $b_1 \cdot b_2 \in A$.

Idempotent node, i.e., $T = F[T_1, \dots, T_k]$ for $k \leq n$ where $T_1, \dots, T_k$ share the same idempotent
value $E \subseteq F$. Let $t_1, \dots, t_k$ be the $n$-computations of respective values $b_1, \dots, b_k$
obtained by applying the induction hypothesis on $T_1, \dots, T_k$ respectively. Furthermore, according
to Theorem 3.3, there exists an $n$-computation $t$ for the word $b_1 \dots b_k$ of height at
most $3|M|$. Let $a$ be the value of $t$. Since $E$ is an idempotent, it is closed under product
and contains $b_1, \dots, b_k$. Since furthermore $t$ does not contain any node of stabilisation, we
obtain that $a \in E$ (by induction on the height of $t$). We conclude using the $n$-computation
$t\{t_1, \dots, t_k\}$ (obtained from $t$ by substituting $t_i$ for the $i$th leaf of $t$) which satisfies the
induction hypothesis.

Stabilisation node, i.e., $T = F[T_1, \dots, T_k]$ for $k > n$ where $T_1, \dots, T_k$ all share the same
idempotent value $E$, with $E^\sharp \subseteq F$. Let $t_1, \dots, t_k$ be the $n$-computations of respective values $b_1, \dots, b_k$
obtained by applying the induction hypothesis on $T_1, \dots, T_k$ respectively. Furthermore,
according to Theorem 3.3, there exists an $n$-computation $t$ for $b_1 \dots b_k$ of height at most
$3|M|$ and value $a$. Since $E^\sharp$ is a sub-stabilisation monoid of $M$ which contains $b_1, \dots, b_k$, $a$
also belongs to $E^\sharp$. Thus the $n$-computation $t\{t_1, \dots, t_k\}$ satisfies the induction hypothesis.

□ 

Lemma 4.18. $M_\uparrow$ is a stabilisation monoid.

Proof. Consider an $n$-under-computation $T$ of value $A$ in $M_\uparrow$ for some word $A_1 \dots A_k$ of
height at most $p$, and some $\alpha(n)$-over-computation $T'$ for the same word of value $B$ (with
$\alpha(n) = \alpha'(n) + 2$ where $\alpha'$ is obtained from Theorem 3.4 applied to $M$ for height at most $3p$).
We aim at $A \leq B$, which means $A \supseteq B$. Indeed, this implies that one can use Proposition |3j
and get that $M_\uparrow$ is a stabilisation monoid.



Let $b \in B$, we aim at $b \in A$, thus proving $B \subseteq A$, i.e., $A \leq B$. By Lemma 4.16, there
exists an $\alpha'(n)$-over-computation for some word $a_1 \dots a_k$ with $a_1 \in A_1, \dots, a_k \in A_k$, of
height at most $3p$ and value $b$. Applying Lemma 4.17 on $T$, there exists an $n$-computation
for the same word $a_1 \dots a_k$ of value $a \in A$. Then applying Theorem 3.4, we get $a \leq b$.
Hence $b \in A$ since $A$ is upward closed. □
We are ready to complete the proof of closure under sup-projection. 



Proof of Theorem 4.13. Assume a function $f$ over the alphabet $A$ is recognised
by $M, h, I$ and let $z$ be some mapping from $A$ to $B$. Construct $H$ from $B$ to $M_\uparrow$ that
maps $b \in B$ to $h(z^{-1}(b)){\uparrow}$, and let $K \subseteq M_\uparrow$ be

$K = \{J \in M_\uparrow : J \cap I \neq \emptyset\}$.

Let us prove that $M_\uparrow, H, K$ recognises the cost function of $f_{\sup,z}$.
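The construction of $H$ and $K$ from $h$, $z$ and $I$ can be made concrete on a small example. Everything in the following Python sketch (the three-element ordered set, the letters, the maps `h` and `z`, the set `I`) is hypothetical data chosen for illustration; only the two defining formulas are taken from the text above.

```python
# Hypothetical toy data (assumptions of this sketch, not from the paper):
# M = {0, 1, 2} ordered as integers, I a downward-closed subset of M.
M = (0, 1, 2)
h = {"x": 0, "y": 2, "w": 1}        # h maps letters of the alphabet A into M
z = {"x": "p", "y": "p", "w": "q"}  # z maps letters of A to letters of B
I = frozenset({0})


def up(subset):
    """Least upward-closed subset of M containing `subset`."""
    return frozenset(y for y in M if any(x <= y for x in subset))


def H(b):
    """H(b) is h(z^{-1}(b)) closed upward."""
    return up({h[c] for c in z if z[c] == b})


def in_K(J):
    """A co-ideal J belongs to K exactly when it meets I."""
    return bool(J & I)
```

Here the letter `p` has two preimages under `z`, one of which is sent into `I` by `h`, so `H("p")` lies in `K`; the letter `q` has a single preimage sent outside `I`, so `H("q")` does not.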
Consider now a word $u = b_1 \dots b_k$ over $B$. Assume

$[\![M_\uparrow, H, K]\!]^{++}(u) \geq \alpha(n) + 1$,

where $\alpha$ is the polynomial from Lemma 4.16. This means that there is an $\alpha(n)$-over-computation
for $H(b_1) \dots H(b_k)$ of value $J \in K$ of height at most $3|M_\uparrow|$. By definition of
$K$, there exists $a \in J \cap I$. Hence, by Lemma 4.16, there exists an $n$-over-computation of
height at most $d = (2|M| + 3) \cdot 3|M_\uparrow|$ of value $a$ for some $a_1 \dots a_k$ with $a_1 \in H(b_1), \dots, a_k \in H(b_k)$.
By definition of $H$, this means that there exist $c_i \in A$ such that $a_i \geq h(c_i)$ and
$z(c_i) = b_i$ for all $i = 1 \dots k$. The obtained word $v = c_1 \dots c_k$ is such that $z(v) = u$, and
$[\![M, h, I]\!]^{++}(v) > n$. This witnesses that

$([\![M, h, I]\!]^{++})_{\sup,z}(u) > n$.

For the converse direction, consider a word $u = b_1 \dots b_k$ over $B$ such that

$[\![M_\uparrow, H, K]\!]^{--}(u) \leq n$.

This means that there exists an $n$-under-computation for $A_1 \dots A_k = H(b_1 \dots b_k)$ of value
$J \notin K$, and height at most $3|M_\uparrow|$. Let $v = c_1 \dots c_k$ be some word over $A$ such that
$z(v) = u$. By definition of $H$, this means that $a_i = h(c_i) \in H(b_i)$ for all $i = 1 \dots k$. Thus,
by Lemma 4.17, there exists an $n$-computation for $a_1 \dots a_k$ of value $a \in J$ of height at
most $d = 9|M||M_\uparrow|$. Since $J \notin K$, this means that $J \cap I = \emptyset$. As a consequence $a \in M \setminus I$.
It follows that $[\![M, h, I]\!]^{--}(v) \leq n$. Since this holds for all $v$ such that $z(v) = u$, we get

$([\![M, h, I]\!]^{--})_{\sup,z}(u) \leq n$.

□

5. On the role of automata

In this paper we have developed the algebraic and logical aspects of regular cost functions 
over finite words. More precisely, we have introduced a notion of logic, cost monadic logic, 
that is suitable for describing functions, and an algebraic notion of stabilisation monoid 
that is suitable for recognising functions up to an equivalence relation ~. We have shown 
that the logically defined functions could be translated into equivalent ones recognisable by 
stabilisation monoids. Decision procedures for several problems involving the existence of 
upper bounds for functions are derived from this translation. 

There could have been several other facets for approaching this theory, a very natural 
one being through automata. The automata theoretic presentation happens to be closer 
to the historical developments. Indeed, the study of distance automata [18], and then of
nested distance desert automata [26], was the original motivation. Following ideas from
[5], it is convenient to consider two dual forms of automata using counters, called B- and
S-automata. The first model computes a minimum over all runs of the maximal values 
taken by counters, and the second form computes a maximum over all runs of the minimum 
value taken by some counters at some identified places in the run. As for regular languages, 
these automata happen to have the same expressiveness for describing cost functions as 
stabilisation monoids. 
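The semantics described for the first model (a minimum over all runs of the maximal value taken by the counters) can be illustrated by a deliberately simplified Python sketch. It is not the definition of B-automata used in the literature: it assumes a single counter, three hypothetical action names (`"inc"`, `"reset"`, `"none"`), and that every counter value reached along a run is observed.

```python
import math


def b_automaton_value(word, transitions, initial, final):
    """Minimum, over accepting runs, of the maximal counter value along the run.

    `transitions` maps (state, letter) to a list of (action, target) pairs,
    where action is "inc", "reset" or "none" (names assumed for this sketch).
    Returns math.inf when there is no accepting run.
    """
    # For each reachable (state, counter) pair, keep the least peak seen so far.
    configs = {(q, 0): 0 for q in initial}
    for letter in word:
        successors = {}
        for (state, counter), peak in configs.items():
            for action, target in transitions.get((state, letter), ()):
                if action == "inc":
                    new_counter = counter + 1
                elif action == "reset":
                    new_counter = 0
                else:
                    new_counter = counter
                new_peak = max(peak, new_counter)
                key = (target, new_counter)
                if new_peak < successors.get(key, math.inf):
                    successors[key] = new_peak
        configs = successors
    peaks = (peak for (state, _), peak in configs.items() if state in final)
    return min(peaks, default=math.inf)
```

As a usage example, a one-state automaton that increments on `a` and resets on `b` assigns to each word the length of its longest block of consecutive `a`s. Keeping only the least peak per (state, counter) pair is sound here because the rest of a run depends only on that pair.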

Technically, all the necessary material for proving the equivalence between automata 
and regular cost functions is already present in this paper. Indeed, in one direction, as it is 
classical for regular languages, automata can be seen as a special fragment of cost monadic 
logic. Thus, automata can only define regular cost functions. For the converse implication, 
it is easy to construct a B-automaton guessing under-computations, or an S-automaton 
guessing over-computations, and use it for describing a recognisable cost function. We have
seen all the necessary material for establishing the correctness of these approaches.

Despite this strong connection, there are several reasons for not presenting automata 
in this document. 

A first reason is to emphasize the difference with the theory of regular languages. In 
the case of languages, the simplest way to show the decidability of monadic logic over 
words is to use automata. This is no longer the case here. Proving the important results
concerning B- and S-automata (the central one being the equivalence between the two models,
called the duality theorem) is more complicated than developing the theory of stabilisation
monoids. In fact, the simplest way to prove the duality theorem is to translate the (say)
B-automaton into a stabilisation monoid, and only then into an S-automaton (though, some 
other techniques are possible). One explanation for this difference between the theory of 
regular languages and regular cost functions is that B-automata and S-automata cannot be 
determinised. For these reasons stabilisation monoids form a much simpler model. 

A second reason is that we could concentrate even more deeply on the model of stabil- 
isation monoid. In particular, we did not only develop stabilisation monoids for obtaining 
decision procedures (as all works using stabilisations were doing so far), but we proved that 
a suitably axiomatised notion of stabilisation monoid can be used to recognise a cost func- 
tion independently of the presence of any cost monadic formula, or any automaton. This is 
reminiscent of the proof in the theory of regular languages of infinite words that finite Wilke
algebras can be translated in a unique way into ω-semigroups. If we were only interested
in decidability questions, the paper could be simplified, and the important Theorem 3.4
omitted.

A third reason is that B-automata and S-automata, which may seem a bit useless in
the light of the previous explanations, are in fact so important that they require a deep
study of their own. The importance of automata does not stem from the question of
decidability of cost monadic logic over words, but over trees (even finite). Indeed, the
situation is reminiscent of the case of regular languages of infinite trees. In this case,
proving the decidability of monadic logic over infinite trees can only be achieved using
infinite tree automata, and the proof also makes use of the ability to determinise automata
over infinite words (see, e.g., the survey [H]). The situation is similar here, and what
is important in the study of automata is to disclose a suitable variant of the notion of
determinism, called history-determinism [6], and to prove that B- and S-automata can be
made history-deterministic. These considerations diverge completely from the content
of this paper.

Acknowledgement. This paper has gained a lot from the discussions with and reviewing
from Achim Blumensath, Michael Vanden Boom, Denis Kuperberg and Christof Löding. I
am also very grateful to the two anonymous reviewers for their constructive remarks and 
their very thorough reading of the document. 

References 

[1] Parosh Aziz Abdulla, Pavel Krčál, and Wang Yi. R-automata. In CONCUR, pages 67-81. Springer, 2008.
[2] Sebastian Bala. Regular language matching and other decidable cases of the satisfiability problem for constraints between regular open terms. In STACS, volume 2996 of Lecture Notes in Computer Science, pages 596-607. Springer, 2004.
[3] Achim Blumensath, Martin Otto, and Mark Weyer. Boundedness of monadic second-order formulae over finite words. In 36th ICALP, Lecture Notes in Computer Science, pages 67-78. Springer, July 2009.
[4] Mikołaj Bojańczyk. Weak MSO with the unbounding quantifier. Theory Comput. Syst., 48(3):554-576, 2011.
[5] Mikołaj Bojańczyk and Thomas Colcombet. Bounds in ω-regularity. In LICS 06, pages 285-296, 2006.
[6] Thomas Colcombet. The theory of stabilisation monoids and regular cost functions. In Automata, languages and programming. Part II, volume 5556 of Lecture Notes in Comput. Sci., pages 139-150. Springer, Berlin, 2009.
[7] Thomas Colcombet. Factorization forests for infinite words and applications to countable scattered linear orderings. Theoret. Comput. Sci., 411(4-5):751-764, 2010.
[8] Thomas Colcombet. Green's relations and their use in automata theory. In Adrian Horia Dediu, Shunsuke Inenaga, and Carlos Martín-Vide, editors, LATA, volume 6638 of Lecture Notes in Computer Science, pages 1-21. Springer, 2011. Invited lecture.
[9] Thomas Colcombet and Christof Löding. The nesting-depth of disjunctive μ-calculus for tree languages and the limitedness problem. In Computer science logic, volume 5213 of Lecture Notes in Comput. Sci., pages 416-430. Springer, Berlin, 2008.
[10] Thomas Colcombet and Christof Löding. The non-deterministic Mostowski hierarchy and distance-parity automata. In Automata, languages and programming. Part II, volume 5126 of Lecture Notes in Comput. Sci., pages 398-409. Springer, Berlin, 2008.
[11] Thomas Colcombet and Christof Löding. Regular cost functions over finite trees. In LICS, pages 70-79, 2010.
[12] Françoise Dejean and Marcel-Paul Schützenberger. On a question of Eggan. Information and Control, 9(1):23-25, 1966.
[13] Lawrence C. Eggan. Transition graphs and the star-height of regular events. Michigan Math. J., 10:385-397, 1963.
[14] Gösta Grahne and Alex Thomo. Approximate reasoning in semistructured data. In KRDB, 2001.






[15] Pierre A. Grillet. Semigroups. An introduction to the structure theory. Pure and Applied Mathematics, Marcel Dekker. 193. New York, NY: Marcel Dekker, Inc. ix, 398 p, 1995.
[16] Kosaburo Hashiguchi. A decision procedure for the order of regular events. Theoretical Computer Science, 8:69-72, 1979.
[17] Kosaburo Hashiguchi. Limitedness theorem on finite automata with distance functions. J. Comput. Syst. Sci., 24(2):233-244, 1982.
[18] Kosaburo Hashiguchi. Regular languages of star height one. Information and Control, 53(3):199-210, 1982.
[19] Kosaburo Hashiguchi. Representation theorems on regular languages. J. Comput. Syst. Sci., 27(1):101-115, 1983.
[20] Kosaburo Hashiguchi. Relative star height, star height and finite automata with distance functions. In Formal Properties of Finite Automata and Applications, pages 74-88, 1988.
[21] Kosaburo Hashiguchi. Improved limitedness theorems on finite automata with distance functions. Theor. Comput. Sci., 72(1):27-38, 1990.
[22] Kosaburo Hashiguchi. Algorithms for determining relative inclusion star height and inclusion star height. Theor. Comput. Sci., 91(1):85-100, 1991.
[23] Kosaburo Hashiguchi. New upper bounds to the limitedness of distance automata. Theor. Comput. Sci., 233(1-2):19-32, 2000.
[24] Karel Culik II and Jarkko Kari. Image compression using weighted finite automata. In MFCS, pages 392-402, 1993.
[25] Daniel Kirsten. Desert automata and the finite substitution problem. In STACS, volume 2996 of Lecture Notes in Computer Science, pages 305-316. Springer, 2004.
[26] Daniel Kirsten. Distance desert automata and the star height problem. RAIRO, 3(39):455-509, 2005.
[27] Daniel Kirsten. A Burnside approach to the finite substitution problem. Theoretical Computer Science, 39(1):15-50, 2006.
[28] Daniel Kirsten. Distance desert automata and star height substitutions. Habilitation, Universität Leipzig, Fakultät für Mathematik und Informatik, 2006.
[29] Daniel Kirsten. On the complexity of the relative inclusion star height problem. Advances in Computer Science and Engineering, 5(2):173-211, 2010.
[30] Manfred Kufleitner. The height of factorization forests. In MFCS, volume 5162, pages 443-454, 2008.
[31] Gérard Lallement. Semigroups and Combinatorial Applications. Wiley, 1979.
[32] Hing Leung. An Algebraic Method for Solving Decision Problems in Finite Automata Theory. PhD thesis, Pennsylvania State University, Department of Computer Science, 1987.
[33] Hing Leung. On the topological structure of a finitely generated semigroup of matrices. Semigroup Forum, 37:273-287, 1988.
[34] Hing Leung. Limitedness theorem on finite automata with distance functions: An algebraic proof. Theoretical Computer Science, 81(1):137-145, 1991.
[35] Hing Leung and Viktor Podolskiy. The limitedness problem on distance automata: Hashiguchi's method revisited. Theoretical Computer Science, 310(1-3):147-158, 2004.
[36] Robert McNaughton. The loop complexity of pure-group events. Information and Control, 11(1-2):167-176, 1967.
[37] Mehryar Mohri. Finite-state transducers in language and speech processing. Computational Linguistics, 23(2):269-311, 1997.
[38] Mehryar Mohri, Fernando Pereira, and Michael Riley. Weighted finite-state transducers in speech recognition. Computer Speech & Language, 16(1):69-88, 2002.
[39] Jean-Éric Pin. Varieties of Formal Languages. North Oxford Academic, London and Plenum, New York, 1986.
[40] Imre Simon. Limited subsets of a free monoid. In FOCS, pages 143-150. IEEE, 1978.
[41] Imre Simon. Recognizable sets with multiplicities in the tropical semiring. In MFCS, volume 324 of Lecture Notes in Computer Science, pages 107-120. Springer, 1988.
[42] Imre Simon. Factorization forests of finite height. Theoretical Computer Science, 72:65-94, 1990.
[43] Imre Simon. On semigroups of matrices over the tropical semiring. RAIRO ITA, 28(3-4):277-294, 1994.
[44] Imre Simon. A short proof of the factorization forest theorem. Tree Automata and Languages, pages 433-438, 1992.






[45] Wolfgang Thomas. Languages, automata and logic. In G. Rozenberg and A. Salomaa, editors, Handbook of language theory, volume 3, chapter 7, pages 389-455. Springer Verlag, 1997.
[46] Szymon Toruńczyk. Languages of profinite words and the limitedness problem. PhD thesis, Warsaw University, 2011.
[47] Andreas Weber. Distance automata having large finite distance or finite ambiguity. Mathematical Systems Theory, 26(2):169-185, 1993.
[48] Andreas Weber. Finite-valued distance automata. Theoretical Computer Science, 134(1):225-251, 1994.



